ABSTRACT


Title of dissertation: A Study of the 5S Ribosomal RNAs of the
Vibrionaceae

Michael Terrell MacDonell, Doctor of Philosophy, 1984

Dissertation directed by: Rita R. Colwell, Professor of Microbiology
The University of Maryland, College Park


The sequence of the 120 nucleotide bases of 5S ribosomal RNA (5S
rRNA) has been determined for each of 23 eubacterial strains classified
as species of the family Vibrionaceae. The methods employed included
the resolution of limited enzymatic digests of end-labeled RNAs by high
voltage polyacrylamide gel electrophoresis, as well as thin-layer
chromatography. Several of the 5S rRNAs for which the primary
structures were determined were also subjected to secondary structure
analysis, employing nuclease S1, which hydrolyzes regions of nucleic
acids not participating in the formation of helices, followed by
resolution of the partial digests on thin sequencing gels.

Sequence data obtained from this study were compiled and analyzed,
using statistical methods and group-specific signature analyses, for the
purpose of constructing a phylogenetic taxonomy of the Vibrionaceae.
Dot matrix maps were generated for 5S rRNA sequences determined in this
study, in order to analyze the occurrence, extent, and complexity of
palindromic and repeated base sequences. Data from nuclease S1
analyses, i.e., observations of interactions among nucleotide bases,
were used in the evaluation of several secondary structure models
derived from most probable base-pairing schema.

Results of sequence determinations of 5S rRNAs indicate that the
family Vibrionaceae, as presently defined, is heterogeneous. The present
genus Vibrio may comprise as many as three genera, one of which is
composed of the majority of named Vibrio species. Results of dot matrix
analyses indicate that the 5S rRNA molecule is composed of several
inverted repeats (palindromes). Analyses of the extent of degeneracy of
palindromic sequences suggest that palindrome analysis may be useful in
determining the extent of evolution of a given species with respect to
others in the Vibrionaceae. Furthermore, composites of palindromic
sequences shared by species of the Vibrionaceae suggest a partial base
sequence of an ancestral 5S rRNA.
Table 1. Bacterial Strains Employed for Sequence Analysis.

Nomenclature                   Strain            Source
-----------------------------  ----------------  ---------------
Aeromonas hydrophila           9071              ATCC (1)
Aeromonas media                33907             ATCC
Alteromonas putrefaciens       8071              ATCC
Escherichia coli               MRE 600           PL Biochemicals
Photobacterium angustum        25913             ATCC
Photobacterium leiognathi      25321             ATCC
Photobacterium logei           15302             ATCC
Plesiomonas shigelloides       14029             ATCC
Vibrio alginolyticus           17749             ATCC
Vibrio anguillarum             19264             ATCC
Vibrio carchariae              35084             ATCC
Vibrio cholerae                14033             ATCC
Vibrio cholerae                E 8498            J.P. Craig
Vibrio "cincinnati"            vp cin            R.B. Bode
Vibrio damsela                 33539             ATCC
Vibrio diazotrophicus          33466             ATCC
Vibrio fischeri                7744              ATCC
Vibrio fluvialis               NCTC (2) 11328    J.V. Lee
Vibrio gazogenes               19900             ATCC
Vibrio marinus (MP-1)          15301             ATCC
Vibrio metschnikovii           7708              ATCC
Vibrio mimicus                 --                ATCC
Vibrio natriegens              --                ATCC
Vibrio parahaemolyticus        17802             ATCC
Vibrio psychroerythrus         27364             ATCC
Vibrio vulnificus              27562             ATCC

Key: (1) American Type Culture Collection, Rockville, MD
     (2) National Collection of Type Cultures, London, UK
MATERIALS AND METHODS


A.  BACTERIAL  STRAINS 

All bacterial strains (see Table 1) employed in the study were
purchased from the American Type Culture Collection (ATCC, Rockville,
MD), with the following exceptions: V. cholerae strain E-8498 was
provided by J.P. Craig, Vibrio "cincinnati" was provided by R.B. Bode,
and V. fluvialis strain NCTC 11328 (ATCC 33812) was provided by J.V.
Lee.

B.  BACTERIAL  CULTURE  MEDIA 

A broth medium consisting of equal parts of Marine Broth 2216
(Difco Laboratories, Detroit, MI) and Tryptic Soy Broth (Difco) was
used for the batch cultivation of marine and estuarine strains.
Tryptic Soy Broth (Difco) was used for the cultivation of
non-salt-requiring bacterial strains. The psychrophilic marine strain
V. psychroerythrus (ATCC 27364) was grown in Marine Broth 2216 (Difco).

C.  RNA  EXTRACTION 

Washed, pelleted bacterial cells from approximately 700 ml of an
exponential phase broth culture were found to be sufficient for recovery
of adequate quantities of 5S rRNA species. When necessary,
Gram-negative bacterial cells were lysed using the freeze-thaw
technique (Zablen et al., 1975). Resuspension of the washed cell
pellet in Tris-borate-EDTA (TBE) buffer usually resulted in immediate
cell lysis, eliminating the need to continue with freezing and thawing


consequence of a polythetic approach that reliance on a priori
assumptions in the development of a taxonomy of the Vibrionaceae was
minimized. The power of polyphasic taxonomy in providing a natural
classification, in the case of the Vibrionaceae, is that it allows
incorporation of new dimensions as the science of systematics advances.
Thus, the ability to elucidate phylogenetic characteristics, i.e.,
characteristics whose possession or absence constitutes a phylogenetic
marker, can now be incorporated into an expanded polyphasic taxonomy for
the Vibrionaceae. Even more attractive is that, having a phylogenetic
basis for the family structure, it is possible to select, a posteriori,
those phenotypic characters embedded in the evolutionary history of the
organisms. Selection of such characters is vital because workers in the
field must rely on a determinative scheme for identification of bacterial
taxa, especially when confronting new, previously unidentified species.

The objective of this study, therefore, was to construct a
classification of the Vibrionaceae, using comparative sequencing of the
5S rRNAs as a means of elucidating the phylogeny of the family. In
addition, the 5S rRNA molecules were analyzed for: relationships between
primary (and secondary) structures and environmental parameters;
evolutionary markers in the primary structures; and information of
evolutionary significance at the level of the genetic transcript, viz.,
conservation of reading frames, repeated sequences, inverted repeats, and
conservation of hairpin loops and helices.
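
As an illustration of the search for inverted repeats named among the
objectives, the following Python sketch compares an RNA sequence with its
own reverse complement, in the manner of a dot matrix. It is not the
program used in this study. The function names, the window size, and the
toy sequence are assumptions made for the example, and only exact
Watson-Crick matches are scored (G-U pairs and degenerate repeats are
ignored).

```python
# Hypothetical sketch: locate inverted repeats by a dot-matrix comparison of
# a sequence against its own reverse complement. Names and data are invented.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def dot_matrix(seq_x, seq_y, window=4):
    """Return the set of (i, j) where the windows starting at i and j match."""
    dots = set()
    for i in range(len(seq_x) - window + 1):
        for j in range(len(seq_y) - window + 1):
            if seq_x[i:i + window] == seq_y[j:j + window]:
                dots.add((i, j))
    return dots


def inverted_repeats(seq, window=4):
    """Return sorted (i, k) pairs, i upstream of k, whose windows can pair."""
    rc = reverse_complement(seq)
    n = len(seq)
    hits = []
    for i, j in dot_matrix(seq, rc, window):
        k = n - window - j          # start of the matching arm in seq itself
        if i + window <= k:         # keep non-overlapping arms, upstream first
            hits.append((i, k))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GCAUCCCCAUGC"            # invented hairpin: GCAU ... AUGC
    print(inverted_repeats(toy, window=4))
```

Each reported pair marks two stretches that could form the arms of a
helix; a larger window gives fewer, longer candidate repeats.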


fixation and ribosomal RNA sequence determinations have shown that the
Vibrionaceae and Enterobacteriaceae share a closely related evolutionary
history (Baumann et al., 1984; Baumann and Schubert, 1984). The proximity
in phylogeny of the two families is reflected in the fact that, taken
together, they contribute more than ninety percent of the species
comprising RNA superfamily I, one of at least five superfamilies of the
eubacterial kingdom (De Vos and De Ley, 1983).

In contrast to their relative similarity, the identification schemes
for the Vibrionaceae and Enterobacteriaceae are strikingly different,
due, in part, to the influence of clinical microbiology on the taxonomy
of the Enterobacteriaceae, which has resulted in "overclassification" of
species and genera. In fact, the phylogenetic depth of the entire family
Enterobacteriaceae is less than that of the genus Vibrio, suggesting that
the genera of the Enterobacteriaceae, as presently defined, more closely
approximate species on the basis of phylogenetic relationships
(MacDonell and Colwell, 1984b).

The Vibrionaceae, as presently defined, consists of the genera
Vibrio (28 species), Aeromonas (4 species), Photobacterium (3 species)
and Plesiomonas (1 species) (Baumann and Schubert, 1984), all of which
are associated with either aquatic or marine environments. Whereas the
taxonomy of the Enterobacteriaceae was heavily influenced by clinical
criteria, the taxonomy of the Vibrionaceae, as presently defined, derives
largely from a polyphasic approach (Citarella and Colwell, 1970; Colwell,
1970; West and Colwell, 1984). Using this approach, dependence upon key
characteristics was abandoned and, instead, a very wide range of equally
weighted characters, both phenetic and genetic, was employed in order to
generate clusters (species) of related strains (Colwell, 1968). It was a


helical regions, there is a tendency for the base sequence to be much
less highly conserved. In fact, the rate of mutation in these
hypervariable regions is approximately twice that of conserved regions
(MacDonell and Colwell, 1984a). That two significantly different rates
of mutation exist in 5S rRNA sequences provides both a coarse and a fine
focus for interpretation of sequence comparisons. Compilations of 5S rRNA
sequences indicate that sequence differences occur in "hypervariable"
regions down to, and possibly including, the species level, indicating
applicability to phylogenetic inferences within given genera. Sequence
differences in the more highly conserved regions, however, appear to
occur only at the genus and family level. Thus, the different mutation
rates allow an extension of the range of taxonomic levels for which
sequences of 5S rRNA can be compared. At present, the total published
library of bacterial 5S rRNA sequences comprises ca. 50 sequences.
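
The coarse and fine focus described above can be illustrated by scoring
differences between two aligned sequences separately for conserved and
variable positions. The Python sketch below is a minimal example under
stated assumptions: the two ten-base sequences and the choice of conserved
positions are invented for illustration and are not data from this study,
and the function name is hypothetical.

```python
# Hypothetical sketch: fraction of differing positions in conserved versus
# variable regions of two pre-aligned sequences. All data are invented.
def regional_divergence(seq_a, seq_b, conserved_positions):
    """Return the fraction of mismatches in conserved and variable regions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    conserved = sorted(set(conserved_positions))
    conserved_set = set(conserved)
    variable = [p for p in range(len(seq_a)) if p not in conserved_set]
    rates = {}
    for label, positions in (("conserved", conserved), ("variable", variable)):
        differences = sum(1 for p in positions if seq_a[p] != seq_b[p])
        rates[label] = differences / len(positions) if positions else 0.0
    return rates


if __name__ == "__main__":
    strain_a = "GCCUGGCGAC"         # invented aligned fragments
    strain_b = "GCCUGACGUC"
    # Positions 0-4 are treated as the conserved region (an assumption).
    print(regional_divergence(strain_a, strain_b, conserved_positions=range(5)))
```

Two closely related strains would be expected to differ mainly in the
variable positions, whereas differences at conserved positions would point
to separation at a higher taxonomic level.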

C. THE FAMILY VIBRIONACEAE

In 1965, Veron proposed the family Vibrionaceae for non-enteric,
Gram-negative rods, and suggested two major criteria for differentiation
of these strains from those of the family Enterobacteriaceae: (1)
possession of a cytochrome oxidase; and (2) motility by means of a single
polar flagellum (Veron, 1975). The classification scheme of Veron (1965)
was not designed to reflect phylogenetic relationships; rather, it was
proposed for convenience in separating the two groups. It is interesting
to note that, although the species assigned to both families have
undergone revision, the revision has been extensive in the case of the
Vibrionaceae (see Baumann et al., 1980). Comparative studies focussing on
bacterial evolution, employing new methods, such as quantitative
microcomplement

(1580 bases compared with 120 in 5S rRNAs) is lost because of the need to
resort to oligomer catalogs. At this writing, only seventeen complete
16S rRNA sequences are known to exist, although more than 4000 16S rRNA
oligomer catalogs have been constructed (A. Böck, personal
communication).
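
The loss of resolution inherent in oligomer catalogs can be shown with a
small Python sketch. It assumes only that ribonuclease T1 cleaves on the
3' side of guanosine residues; the function names and the two nine-base
sequences are invented for illustration. The two sequences differ in the
order of their fragments, yet yield identical catalogs.

```python
# Hypothetical sketch: simulate a complete ribonuclease T1 digest (cleavage
# after every G) and compare the resulting oligomer catalogs.
from collections import Counter


def t1_oligomers(seq):
    """Split an RNA sequence after each G, as in a complete T1 digest."""
    fragments, current = [], ""
    for base in seq:
        current += base
        if base == "G":
            fragments.append(current)
            current = ""
    if current:                     # 3'-terminal fragment not ending in G
        fragments.append(current)
    return fragments


def t1_catalog(seq, min_length=1):
    """Return the unordered catalog (multiset) of T1 oligomers."""
    return Counter(f for f in t1_oligomers(seq) if len(f) >= min_length)


if __name__ == "__main__":
    seq_1 = "AUGCCGUAG"             # invented sequences
    seq_2 = "CCGAUGUAG"
    print(t1_oligomers(seq_1))
    print(t1_oligomers(seq_2))
    print(t1_catalog(seq_1) == t1_catalog(seq_2), seq_1 == seq_2)
```

Because a catalog records which oligomers occur but not their order,
distinct sequences can share one catalog, which is why complete sequences
resolve relationships that catalogs cannot.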

Although E. coli 5S rRNA was one of the first nucleic acid molecules
for which the complete nucleotide base sequence was determined (Brownlee
et al., 1967), its secondary structure is still a subject of controversy.
Application of rules derived from more than a decade of research on the
primary and secondary structure of tRNA (Tinoco et al., 1971; Ninio,
1979) has provided several structures, all of which are consistent with
the known physical data, but none has gained acceptance as the
representation of spatial conformation. The concept suggests that the
role of 5S rRNA in the 50S ribosomal subunit may be modulatory; thus, it
may well involve switching between two (or more) conformations.

It is now reasonably well established that comparisons among 5S
rRNAs provide a rather firm basis for evaluating evolutionary relatedness
among bacterial species (Luehrsen and Fox, 1981; Dekio et al., 1984;
MacDonell and Colwell, 1984b; Sogin et al., 1972). In fact, 5S rRNAs
represent ideal material for sequence determination since they are easily
isolated and purified to homogeneity (MacDonell and Hansen, 1985), are
small enough to be sequenced in their entirety, and appear to lack
post-transcriptionally modified bases (Luehrsen and Fox, 1981; MacDonell
and Colwell, 1984d). With regard to comparative sequencing, 5S rRNAs
consist of two qualitatively different regions. In one, that generally
associated with single-stranded portions of the molecule, the nucleotide
base sequence is highly conserved. In the other, usually associated with

which consists of 23S, 16S and 5S rDNA, along with sequences coding for
one or more tRNAs (Lund et al., 1979). Pioneering work in the field of
rRNA sequence analysis involved comparisons among catalogs of oligomers
prepared from ribonuclease T1 digests of ribosomal RNAs (Sogin et al.,
1972). This method gave rise to the employment of two-enzyme comparative
catalogs, from which ribonucleotide base sequences could be inferred with
a relatively high degree of accuracy (Uchida et al., 1974). Chemical
sequencing methods for RNA were developed and improved to the point where
the primary structures of rRNA molecules could be determined both
routinely and unambiguously (Peattie, 1979). Enzymatic methods
(Donis-Keller et al., 1979; Donis-Keller, 1980) were slower in
development and were more restricted in range, since they could not be
employed with either tRNAs or eukaryotic RNAs in which substantial
post-transcriptional modification of nucleotide bases occurs. Prokaryotic
5S rRNAs, however, provide ideal substrates for enzymatic methods, and
can be sequenced unambiguously using established techniques (MacDonell
and Colwell, 1984d). It is still considered impractical, in terms of
both time and material, to sequence large ribosomal RNAs. Although
oligomer catalogs from digests of the relatively small 5S rRNAs are
virtually useless, the sequence of their 120 nucleotide bases can be
determined readily. 5S rRNA studies, therefore, generally employ
sequence determinations, whereas 16S rRNA studies still depend on
oligomer cataloging. Comparisons among populations of oligomers,
however, do not allow the same resolution as comparisons among sequences,
since similar oligomer populations do not necessarily reflect unique
sequences. It is frustrating, therefore, that in the construction of
evolutionary trees, much of the potential resolution offered by 16S rRNAs


phosphatase, superoxide dismutase, and glutamine synthetase (Bang et al.,
1981; Baumann et al., 1980; Baumann et al., 1983; Woolkalis and Baumann,
1981). Comparative sequencing of ribosomal RNAs, however, remains
unrivaled as a method for evaluating phylogenetic relationships.

B.  RIBOSOMAL  RNA  AND  EVOLUTION 

The ribosome is believed to have evolved in three stages: (1)
establishment of a relatively simple archetypal mechanism, along with a
primitive set of codon assignments; (2) increase in complexity of the
translation mechanism, in which the codon assignments assumed their
present form; and (3) increase in the efficiency, i.e., rapidity and
precision, of the process of translation (Woese, 1970). Allowing that, in
the process of evolution, central features of primitive processes remain
the central features of their evolved counterparts, it is reasonable to
expect the modern bacterial translation apparatus to reflect
characteristics of its evolutionary predecessor (Fox and Woese, 1975).
Furthermore, since it can be shown that ribosomal RNAs are very highly
conserved, they should reflect the base composition and, to some extent,
the base sequence of the primitive rDNA cistron and, therefore, the
ancestral genome. Similarly, primary structures of ribosomal components,
particularly the ribosomal RNAs, provide unique insights into
phylogenetic relationships among bacterial taxa, as well as into the
nature of biochemical evolution.

Ribosomal RNAs comprise approximately 80% of the total bacterial
RNA, and are processed post-transcriptionally from a single large (30S)
RNA polymer (Dunn and Studier, 1973; Nikolaev et al., 1973). This single
transcript is coded for by the "ribosomal RNA transcription unit" (rRTU),


compositions provided an interesting contrast to classically derived
taxonomic schema (Colwell and Mandel, 1964; Thornley, 1967), and the
impact of DNA base composition on bacterial systematics has been such
that it is now a requirement for the minimum description of a new
bacterial species by the International Committee on Systematic
Bacteriology.
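
For a sequence that is already known, the G+C molar ratio discussed above
reduces to a simple count, as in the Python sketch below. This is an
illustration only: base compositions of whole genomes were not obtained
from sequence in this way, and the function name and the test strings are
assumptions made for the example.

```python
# Hypothetical sketch: G+C content (mol%) of a nucleotide sequence.
def gc_mol_percent(seq):
    """Return the G+C content of a DNA or RNA sequence as a percentage."""
    bases = [b for b in seq.upper() if b in "ACGTU"]   # ignore other symbols
    if not bases:
        raise ValueError("sequence contains no recognizable bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


if __name__ == "__main__":
    print(gc_mol_percent("ATGC"))   # invented four-base example
```

Strains whose G+C values differ widely cannot be closely related, whereas
similar values are consistent with, but do not prove, relatedness.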

Within a decade after the advent of DNA base composition analysis,
DNA/DNA hybridization methods significantly extended the sensitivity with
which the bacterial genome could be probed. These methods provided a
means by which the primary genetic structures of two distinct genomes
could be compared directly, permitting inference of taxonomic
relationships based on phylogeny, and by which phylogenetic distances
between closely related strains could be quantitated by the extent of DNA
homology shared by the strains.

More recently, the ability to routinely determine sequences of
nucleotide bases in genetic material (Sanger et al., 1977; Donis-Keller,
1979; Peattie, 1979; Maxam and Gilbert, 1980; MacDonell and Colwell,
1984d) has elevated the sophistication of nucleic acid methods. Large
quantities of phylogenetic information can be obtained, stored on the
computer, and retrieved, permitting sequence data comparisons as the
sequences are determined, i.e., on-line, if desired.

The introduction of comparative sequencing and, to a lesser extent,
immunologic comparisons of key biological molecules has allowed
construction of genealogies of species. Biological molecules used for
tracing of natural relationships include quinones (Collins and Jones,
1979), cytochromes (Schleifer et al., 1982), ferredoxins (Schwartz and
Dayhoff, 1978) and microcomplement fixation techniques employing alkaline


Woese et al., 1984). The authority reflected in the existing system of
nomenclature, even though not warranted, suggests the existence of
natural relationships and, by implication, a phylogeny, however
speculative.

We might consider the period between the biochemical taxonomy of
Orla-Jensen (1909), and the publication by Fox et al. of The Phylogeny of
the Prokaryotes (1981), as the determinative era in microbial
systematics. It was during the last two decades of this era that the
enormous advances in molecular biology occurred which brought the
construction of a natural taxonomy of bacterial species closer to
reality. Methods for nucleic acid analysis, including DNA/DNA and
DNA/RNA hybridization, oligomer cataloging, and nucleic acid sequencing,
have attained some prominence in microbial taxonomy in recent years,
although the significance of some results obtained using these methods
is open to interpretation. Nevertheless, the newer methods permit
analysis, in great detail, of the molecular genetic structure, from which
direct evidence of natural relationships among bacterial species can be
deduced.

The first of the new generation of methods to emerge was the
determination of bacterial DNA base composition (Lee et al., 1956;
Belozersky and Spirin, 1960), which provided a direct, although crude,
probe into the bacterial genome. Results of comparisons of "base ratio"
determinations clearly demonstrated the significance of G+C molar ratios
in bacterial taxonomy (Colwell and Mandel, 1964; Hill, 1966; MacDonell
and Colwell, 1984f), and even now provide a powerful and routine means of
confirming relatedness (or the lack of relatedness) among phenetically
similar strains. The earliest compilations of bacterial base

waning need to establish a bacterial phylogeny was all the more insidious
since several generations of microbiologists have had to satisfy
themselves with a taxonomy in a constant state of flux. For these
microbiologists, also, the elucidation of natural relationships among
bacterial species appeared to be an intractable problem, if not a myth.
A result is that many microbiologists have regarded the construction of a
phylogenetic taxonomy of bacterial species as a game of speculation and
an activity irrelevant to the practical aspects of microbiology.

Interestingly, a priori determinative approaches to bacterial
taxonomy, a school of thought for which the later editions of Bergey's
Manual might be regarded as a cornerstone, appeared to be reasonably
successful. Determinative approaches, indeed, provided a practical basis
for allocation of bacterial strains to taxa, although at the expense of
regarding phylogenetic relationships as secondary in importance. It is
ironic that, in an atmosphere of skepticism and suspicion of putative
phylogenetic treatments, a level of skill and technology has been
achieved that is sufficient to determine natural, i.e., phylogenetic,
relationships with relative ease.

Born largely out of clinical considerations and reinforced by
successive editions of Bergey's Manual, an exaggerated emphasis on the
identification of bacteria by means of a minimum number of key
characteristics, i.e., a determinative approach, has promoted a false
confidence that certain "key" phenetic characters are a priori sufficient
for the definition of natural taxa. Unfortunately, such practice has
been reinforced by the use of Latin and Greek nomenclature, with
associated rules, reserved, by convention, for classifications which
reflect true phylogenetic relationships (Stackebrandt and Woese, 1984;


INTRODUCTION 


A.  HISTORIC  APPROACHES  TO  BACTERIAL  SYSTEMATICS 

The need to define natural relationships among bacterial species has
dominated classical microbial systematics since its beginnings more than
a century ago. Indeed, numerous systems of taxonomy purporting to
reflect a natural order have been proposed (Cohn, 1872; Bergonzini, 1879;
Orla-Jensen, 1909), in general based on either morphological or
physiological criteria. Despite transient popularity, none of these
systems, founded necessarily on purely phenetic traits, can be said to
represent the true natural order. Bisset (1950, 1952, 1957) attempted an
ambitious bacterial taxonomy based on what he speculated to be
phylogenetic criteria, but was unsuccessful since there existed at that
time no basis for evaluating whether or not a feature was a phylogenetic
characteristic. Thus, the "bacterial phylogeny" of Bisset (1950, 1952,
1957) was, itself, speculative (Sneath, 1962). The failure to achieve a
phylogenetic taxonomy is not surprising since methods involving direct
genetic comparison necessary for (i) detection of true evolutionary
relationships; and (ii) identification of valid phylogenetic
characteristics, have only recently become available.

Over the past several decades bacterial taxonomy has undergone
frequent and extensive rearrangement. Unfortunately, there was also a
slow but significant loss of interest in the historic search for a
natural classification of the bacteria. Indeed, some microbiologists
lost confidence in the belief that a natural classification of bacterial
species could be constructed (Shimwell and Carr, 1960; Cowan, 1962). The

the cells, and significantly reducing exposure of RNAs to endogenous
RNases and to lysozyme, a possible source of foreign RNases. Cell
lysis was evidenced by a rapid and marked increase in viscosity (as
well as a noticeable clearing) of the cell suspension.

The method of Kirby (1956), with the modifications discussed
below, was used for extracting 5S rRNA. In the first step, i.e., crude
nucleic acid extraction, the crude cell lysate was brought to 1 M NaCl
and vortexed with an equal volume of chloroform for 10 minutes. The
aqueous and organic phases were separated by centrifugation at 12,000 x
g for 15 minutes at 4 C. Substitution of chloroform for phenol in the
first extraction step increased yields of the total nucleic acid
fraction by as much as 50% (MacDonell and Hansen, 1985). Since chloroform
and phenol remove proteins from aqueous solutions by different
mechanisms, virtually all chloroform-precipitable proteinaceous
material was removed with a single chloroform extraction. The aqueous
(upper) phase, containing the nucleic acid fraction, was removed from
underneath with a hook-shaped Pasteur pipette. The aqueous phase was
precipitated with 2 volumes of cold (-80 C) absolute ethanol and mixed
several times by inversion, after which it was placed on dry ice for 10
minutes. After chilling, the precipitate was collected by centrifugation
(12,000 x g for 10 minutes at 4 C), and resuspended in sterile 50 mM TPE
buffer. The resuspended nucleic acid pellet was re-extracted, using a
phenol solution of the following composition: 79% (w/v) phenol
(redistilled), 11% (w/v) m-cresol (redistilled), 10% (v/v) TPE buffer,
and 0.05% (w/v) 8-hydroxyquinoline. After collection, the aqueous phase
was precipitated with 2 volumes of cold (-90 C) absolute ethanol, mixed
several times by inversion and chilled on dry ice for 10 minutes. The
total nucleic acid (NA) fraction was collected by centrifugation as
described above. After the alcoholic supernatant had been decanted,
the nucleic acid pellet was dried partially under vacuum for several
minutes in order to remove excess ethanol. The nucleic acid was
resuspended in as small a volume of sterile distilled water as would
permit complete dissolution, generally 10 to 20 ml.
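
The composition of the phenol solution given above can be scaled to any
batch size with a few lines of arithmetic. The Python sketch below is a
convenience illustration only: the percentages are those stated in the
text, with w/v taken as grams per 100 ml and v/v as milliliters per 100
ml, while the function name and the 100 ml batch are assumptions.

```python
# Hypothetical sketch: scale the stated phenol extraction mixture to a batch.
# Percentages are from the text; w/v = g per 100 ml, v/v = ml per 100 ml.
PHENOL_MIX = {
    "phenol (redistilled)": (79.0, "w/v"),
    "m-cresol (redistilled)": (11.0, "w/v"),
    "TPE buffer": (10.0, "v/v"),
    "8-hydroxyquinoline": (0.05, "w/v"),
}


def reagent_amounts(batch_ml):
    """Return {reagent: (amount, unit)} for a batch of the given volume."""
    amounts = {}
    for name, (percent, basis) in PHENOL_MIX.items():
        unit = "g" if basis == "w/v" else "ml"
        amounts[name] = (round(percent * batch_ml / 100.0, 3), unit)
    return amounts


if __name__ == "__main__":
    for reagent, (amount, unit) in reagent_amounts(100).items():
        print(f"{reagent}: {amount} {unit}")
```

The same function applies to any batch volume, for example
reagent_amounts(250) for a 250 ml preparation.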

D.  ISOLATION  OF  RNA 

To the aqueous nucleic acid solution was added an equal volume of
redistilled 2-methoxyethanol, 2.5 M K2HPO4, and 0.05 volume of 3% (v/v)
H3PO4. This was mixed vigorously for 10 minutes and separated by
centrifugation for 10 minutes at 12,000 x g at 4 C. The aqueous
(upper) phase, containing the total RNA fraction, was collected, and
precipitated by addition of 2 volumes of cold ethanol followed by
chilling on dry ice, as described above. Prior to collection by
centrifugation, a substantial quantity of fluffy white RNA, occupying
as much as 1/3 of the volume of the centrifuge tube, was usually
evident. The RNA pellet was dissolved in a minimum volume (a few
milliliters) of autoclave-sterilized TBE buffer.

E. DEAE-CELLULOSE CHROMATOGRAPHY

The small oligomer fraction, representing a significant proportion
of the crude RNA extract, was removed before fractionation and
purification of 5S rRNA by polyacrylamide gel electrophoresis. The
fine cellulose particles were removed (Peterson, 1980) from
approximately 1 gram of washed and TBE-equilibrated DEAE-cellulose
(Cellex-D, Bio Rad, Richmond, CA). The prepared DEAE-cellulose was
packed in a 5 ml sterile plastic syringe barrel, to a depth of 5 to 6
cm. The crude RNA solution was then applied to the column, and the
column was rinsed with ten column volumes of TBE buffer. A rinse with
0.2 M NaCl in TBE buffer was sufficient to remove small oligomers (less
than about 30 bases). The fraction containing 5S rRNA was eluted with
1 M NaCl, 7 M urea in 50 mM TBE buffer (Hansen, 1981), and precipitated
by addition of 2 volumes of cold absolute ethanol and chilling on dry
ice for 10 minutes. The precipitated RNA was collected by
centrifugation, as described above, and suspended in a tracking
dye-buffer of the following composition: 8 M urea, 0.05% xylene cyanol
and 0.05% brom-phenol blue in TBE buffer.

F.  PURIFICATION  OF  3S  rRNA 

5S rRNA was isolated and purified using 5% (w/v) acrylamide
preparative stacking gels, in which the upper- and lowermost stacks
(each approximately 25% of the length) consisted of
acrylamide/bis-acrylamide, and the center stack (middle 50%) consisted
of acrylamide/bis-acrylylcystamine (BAC, Bio Rad, Richmond, CA).
Substitution of bis-acrylylcystamine, a thiol-soluble cross-linking
agent, for N,N'-methylene-bis-acrylamide produces a soluble acrylamide
gel (Hansen, 1976; Hansen et al., 1980; Hansen, 1981). The use of
conventional (bis) acrylamide in the upper stack facilitates the
formation of clean, square-bottomed slots (since 5% BAC is somewhat
sticky and tends to adhere to slot formers). After casting the first
(lowest) stack, the gel form (along with the degassed, uncatalyzed
BAC-acrylamide) was placed in an incubator set at 42 C to 45 C and left
to equilibrate for approximately an hour. It was necessary to warm the
gel form thoroughly, prior to pouring the gel, since failure to do so
was found to favor the formation of insoluble ON bonds, a consequence
of reduced temperature during polymerization. After completion of
polymerization, the BAC-acrylamide was poured to within 3 or 4 cm from
the top of the gel form and polymerization allowed to proceed for 60
minutes at 42 C to 45 C. After removal of the gel form from the
incubator, the remaining bis-acrylamide solution was degassed,
initiated and poured. A large 5-tooth preparative slot former was
inserted. Gels (150 mm x 175 mm x 4 mm) were pre-electrophoresed for
at least 30 minutes at 4 to 5 W (constant power), i.e., ca. 20 to 25
mA. After pre-electrophoresis, the sample wells were loaded with 50 to
100 microliters of RNA/tracking dye solution and electrophoresed at 7 W
(constant power), i.e., ca. 40 mA. The current was adjusted so that
the glass plates of the gel form reached 55 C to 65 C, in order to
inhibit formation of secondary structure. Electrophoresis of the RNA
solution was continued until the brom-phenol blue band began to exit
the BAC-acrylamide gel stack, after which the gel was removed, stained
for 30 minutes with a 1 ug/ml solution of ethidium bromide in
distilled water, and viewed on an ultraviolet (short wavelength)
transilluminator or, alternatively, imaged by UV-shadowing (Hassur and
Whitlock, 1974).
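
The constant-power settings quoted above imply a running voltage through
V = P / I. The following minimal sketch (Python is used here only for
illustration; the function name is hypothetical and not part of the
original programs) works through that arithmetic for the stated values.

```python
def voltage_from_power(power_w, current_ma):
    """Return the voltage (V) implied by a power (W) and a current (mA)."""
    return power_w / (current_ma / 1000.0)

# Settings quoted in the text: pre-electrophoresis at 4 to 5 W (20 to 25 mA)
# and the preparative run at 7 W (about 40 mA).
for power_w, current_ma in [(4, 20), (5, 25), (7, 40)]:
    volts = voltage_from_power(power_w, current_ma)
    print(f"{power_w} W at {current_ma} mA -> {volts:.0f} V")
```

Under these figures the gel runs at roughly 200 V during
pre-electrophoresis and about 175 V during the preparative run.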

The 5S rRNA band, located about 85% of the distance from the
brom-phenol blue band to the xylene cyanol band, was excised with a
sterile  blade  (see  Carmichael,  1980;  Hecht  and  Woese,  1968;  Loening, 
1967).  The  excised  gel  plugs  were  placed  into  sterile  siliconized 
glass  test  tubes,  and  solubilized  with  5  ml  of  sterile  TBE  buffer  and 
75  ul  of  2-mercaptoethanol  (Hansen,  1981). 

G. RECOVERY OF RNA FROM LIQUEFIED GEL

DEAE-cellulose columns were prepared as follows. A small plug of
glass wool was aseptically placed in the bottom of a long tip pasteur
pipette. Both the glass wool and the pasteur pipettes were silanized
and autoclave-sterilized. To this was added sufficient DEAE-cellulose
slurry to form a bed of about 1 cm in height. This was pre-rinsed with
1 to 2 ml of the final elution buffer, consisting of 1 M NaCl, 7 M urea
in 50 mM TBE buffer, followed by equilibration with 5 ml of fresh TBE
buffer. The solution containing RNA and solubilized gel was added to
the column and rinsed with 2 ml of fresh sterile TBE buffer followed by
2 ml of 0.2 M NaCl in TBE buffer. The 5S rRNA fraction was eluted with
500 ul of 1 M NaCl in TBE, added in aliquots of 100 ul (Hansen, 1981).
5S rRNA thus recovered was found to be of sufficient purity for
sequence analysis, i.e., >99% pure.

H.  SILANIZING  GLASS  AND  PLASTIC  WARE 

All glassware used in the isolation and purification of RNAs was
silanized (Schleif and Wensink, 1981), and rinsed in reagent grade
water (Milli-Q reagent water system, Millipore Inc., Bedford, MA).
Glassware was baked at 300 C for 4 hours. Plastic micropipette tips
and microfuge tubes were silanized as follows. Approximately 2 ml of a
1:1 mixture of chloroform and dimethyldichlorosilane was pipetted onto
a watch glass placed in the bottom of a glass vacuum desiccator from
which the desiccant had been removed. Tubes and pipette tips to be
silanized were distributed inside, and vacuum was applied for about 10
seconds before being released abruptly. This process was repeated 10
to 15 times, after which the plastic ware was removed, rinsed
thoroughly with distilled water, and sterilized for 5 minutes in an
autoclave followed by a drying cycle.

I.  END-LABELING  RNA  WITH  32P 

Two methods of end-labeling RNA species with 32P were employed.
In the first, [5'-32P] cytidine-bis-phosphate was covalently attached
to the 3' (OH) terminal of the RNA. The second involved transfer of
the (gamma) phosphate of [gamma-32P] ATP to a dephosphorylated 5'
terminal nucleotide. End-labeling the 5' terminus of nucleic acids
required a dephosphorylation step, since forward reaction in the case
of phosphate transfer on the 5' terminal is strongly favored by
dephosphorylation of the 5' terminal nucleotide. Although labeling of
the 3' terminal base (using RNA ligase) generally requires a
dephosphorylation step, in order to avoid formation of concatemers
(chains of oligomers attached 5' to 3'), 5S rRNAs are folded in such a
way as to produce a 3' overhang, which obscures the 5' terminal base.
Therefore, dephosphorylation of the 5' terminus prior to 3'
end-labeling was omitted. Since the unequivocal determination of the
complete sequence requires construction of sequence ladders from both
ends, separate aliquots of the 5S rRNA species were labeled on each
terminus in turn.

J. DEPHOSPHORYLATION OF THE 5' TERMINUS

High yields of end-labeled RNA by the forward phosphorylation of
5' hydroxyls required dephosphorylation of the native 5' terminus
(Richardson, 1965; Lillehaug et al., 1976). Calf intestinal (alkaline)
phosphatase (CIP) was found to be superior to bacterial alkaline
phosphatase (BAP) for the dephosphorylation of bacterial RNAs.
Furthermore, highly purified (molecular biology grade) CIP is
commercially available (Boehringer, Indianapolis, IN). Approximately
25 pmol (1 ug) of 5S rRNA is sufficient for a complete sequence
analysis. With 250 uCi of fresh (i.e., less than a week old)
[gamma-32P] ATP, end-label yields on the order of 10 million cpm/ug of
RNA were commonplace.
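
The equivalence of 1 ug and about 25 pmol of 5S rRNA, and the reason
for preferring label less than a week old, can be checked with a short
calculation. The sketch below is illustrative only: it assumes an
average nucleotide mass of roughly 340 g/mol (an approximation, not a
value from this study) and uses the 14.3 day half-life of 32P; the
function names are hypothetical.

```python
AVG_NT_MASS = 340.0        # g/mol per nucleotide residue (approximation, assumed)
P32_HALF_LIFE_DAYS = 14.3  # physical half-life of 32P

def pmol_from_ug(mass_ug, length_nt=120, nt_mass=AVG_NT_MASS):
    """Convert a mass of RNA (ug) to picomoles for a given chain length."""
    molar_mass = length_nt * nt_mass          # g/mol
    return mass_ug * 1e-6 / molar_mass * 1e12

def remaining_activity(initial_uci, age_days, half_life=P32_HALF_LIFE_DAYS):
    """Activity (uCi) left after age_days of radioactive decay."""
    return initial_uci * 0.5 ** (age_days / half_life)

print(f"1 ug of 120-nt RNA is about {pmol_from_ug(1.0):.1f} pmol")
print(f"250 uCi after 7 days is about {remaining_activity(250, 7):.0f} uCi")
```

With these assumptions, 1 ug of a 120-nucleotide RNA corresponds to a
little under 25 pmol, and a 250 uCi lot retains roughly 70% of its
activity after one week.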

To a silanized autoclave-sterilized microfuge tube was added 1 ug
5S rRNA in 10 ul sterile water; 0.1 U of CIP, 1 mM MgCl2, 90 ul CIP
buffer, 50 mM Tris-HCl, pH 9.0, 1 mM spermidine. This was incubated
for 30 minutes at 37 C. The reaction was terminated by the addition of
100 ul of phenol/m-cresol solution (described above), and briefly
vortexed. Organic and aqueous phases were separated by centrifugation
in a microfuge for 5 minutes at 4 C. The aqueous (upper) phase was
collected, using a sterile micropipette, placed in a sterile microfuge
tube and chilled in an ice bath. To the organic phase was added 100 ul
of TBE. This was vortexed briefly, and phases were separated by
centrifugation, as described above. The aqueous phase was collected
and pooled. 100 ul of phenol/m-cresol was added, vortexed briefly, and
centrifuged, as described above. The aqueous phase was collected and
to it was added 500 ul diethyl ether (stored over water). The mixture
was vortexed briefly, centrifuged for 10 seconds and the upper (ether)
phase was discarded. Ether extraction was repeated once. The
ether-extracted RNA solution was degassed on a vacuum line for several
minutes to remove traces of dissolved ether, and placed in an ice bath.
50 ul of 2 M sodium acetate (pH 5.5) was added, vortexed briefly, and
returned to the ice bath. The RNA was precipitated with 400 ul cold
(-90 C) absolute ethanol, mixed several times by inversion, and chilled
on dry ice for 10 minutes. The RNA was collected by centrifugation for
3 minutes in a microfuge. Ethanol was gently removed using a drawn
capillary, taking care to avoid disturbing the RNA pellet (usually not
visible). 400 ul of cold absolute ethanol was carefully layered over
the RNA, chilled on dry ice for 2 minutes, and centrifuged for 1 minute
in a microfuge. The ethanol was discarded, as described above. The
dephosphorylated RNA was placed in a vacuum desiccator for 15 minutes
to remove all remaining traces of ethanol.


K. LABELING THE 5' TERMINUS


The method employed to end-label the 5' terminal base of 5S rRNA
was a modification of the method described by D'Alessio. The
dephosphorylated 5S rRNA was resuspended in 10 ul of water. To this was
added 3 ul of 10 uM ATP, and 2 ul of 10X kinase buffer of the following
composition: 0.3 M Tris-HCl, pH 9.0; 10 mM MgCl2; and 10 mM spermidine.
It was then placed in an ice bath. 250 uCi of [gamma-32P] ATP
(Amersham, Springfield, IL), packed in a 1:1 ratio of ethanol and water,
was evaporated to dryness under a nitrogen jet, after which the reaction
mixture was added and vortexed briefly. Five U of T4 polynucleotide
kinase (PNK) was added. The mixture was vortexed briefly and incubated
for 13 minutes at 37 C. After incubation, the reaction was terminated
by addition of 70 ul of ammonium acetate and 23 ug of phenol-extracted
tRNA (carrier). The 32P-RNA was precipitated by the addition of 200 ul
of cold (-90 C) absolute ethanol, and mixed by inversion. The mixture
was chilled on dry ice for 10 minutes and collected by centrifugation in
a microfuge for 3 minutes. The alcoholic (radioactive) supernatant was
collected with a drawn capillary and discarded as radioactive waste.
The 32P-RNA was resuspended in 100 ul of cold 0.3 M sodium acetate by
swirling, precipitated in 700 ul of cold ethanol, and collected as
described above. The pellet was then overlayed with 300 ul of cold
ethanol, chilled on dry ice, centrifuged, and collected as described
above. The 32P-RNA was placed in a vacuum desiccator for 10 to 15
minutes to remove remaining traces of alcohol.

L. LABELING THE 3' TERMINUS


To approximately 1 ug of RNA, resuspended in 10 ul of water, was
added 2 ul dimethyl sulphoxide (DMSO); 2 ul of 0.2 M ATP; and 2 ul of
10X ligase buffer of the following composition: 0.5 M HEPES, pH 7.5;
0.1 M MgCl2; and 33 mM dithiothreitol (DTT). 250 uCi of [32P] pCp (ICN
Radiochemicals, Palo Alto, CA), packed in water, was evaporated to
dryness under a nitrogen jet, after which the reaction mixture was added
and vortexed briefly. The reaction mixture was placed in an ice bath,
and 10 U of RNA ligase was added. The mixture was vortexed briefly, and
incubated for 4 hours at 4 C. Termination of the reaction,
precipitation, and collection of 32P-RNA was exactly as described above
for 5' end-labeling.

M. ELECTROPHORETIC PURIFICATION OF 32P-RNA

The end-labeled RNA was resuspended in a sterile siliconized
microfuge tube with 20 ul of tracking dye/buffer (see Isolation and
Purification section) and placed in an ice bath. A 5% polyacrylamide/8
M urea denaturing gel, 40 cm in length, was cast and pre-electrophoresed
for 30 minutes. The 32P-RNA/tracking dye mixture was heated briefly to
90 C, chilled, loaded on the gel, and electrophoresed until the xylene
cyanol (upper) band travelled 3/4 of the length of the gel. The glass
plates were removed, and the gel covered with plastic film and
autoradiographed using X-ray film (XAR-5, Kodak Corp., Rochester, NY).
The image of the 5S rRNA band was identified (Figure 1) and cut from the
film using a razor blade, and the film used as a template to locate the
position of the 5S rRNA in the gel. After positioning the film
template, the 5S rRNA band was excised and placed in a sterile
siliconized 1 ml (blue) Eppendorf pipette tip which had been heat sealed
and plugged with a small quantity of sterile siliconized glass wool.


Figure 1. Relative locations of RNA bands in a 5% acrylamide
denaturing gel. Bands were generated by the fractionation of 32P-RNA
on a 5% acrylamide gel containing 9 M urea. Locations of xylene
cyanol FF (xc) and brom-phenol blue (bpb) are indicated by arrows.


N. ELUTION OF 32P-RNA FROM bis-ACRYLAMIDE GELS

Elution of 32P-RNA from conventional bis-acrylamide gels was
accomplished as follows. The gel slice was crushed to a paste against
the walls of the pipette tip using a sterile siliconized glass rod. The
glass rod was rinsed into the gel paste with TOO ul of 0.5 M ammonium
acetate, 1 mM EDTA in 100 ul aliquots. Phenol-extracted carrier tRNA
was added to a ratio of approximately 50:1 (in the majority of cases,
about 50 ul). The top of the pipette tip was sealed with parafilm and
incubated overnight at Z1 C. The 32P-RNA was recovered by carefully
cutting the heat-sealed tip with a sterile blade, then removing the
parafilm from the top of the pipette tip, and allowing the RNA solution
to collect in a sterile 1.5 ml microfuge tube. The pipette tip was
rinsed with an additional 100 ul of ammonium acetate. The pooled eluate
was precipitated with 2 volumes of cold (-20 C) absolute ethanol, mixed
several times by inversion, and chilled on dry ice for 10 minutes. The
32P-RNA was collected by centrifugation in a microfuge for 5 minutes at
4 C, and stored in sterile EDTA (0.01 M, pH 6.5) at -20 C.

O. TERMINAL ANALYSIS

1. IDENTIFICATION OF THE 5' TERMINAL NUCLEOTIDE

*,000 to 20,000 cpm of 5' end-labeled RNA was exhaustively digested
with nuclease P1 and chromatographed on a PEI-cellulose thin layer plate
(with fluorescent indicator). Standards composed of the 4 common
nucleotide base monophosphates were spotted on the TLC plate and
chromatographed alongside the labeled digest, using 0.25 M LiCl as the
mobile phase (Randerath and Randerath, 1967). After completion of the
chromatography, the reference monophosphates were located with a short
wavelength black light, and their positions marked on the TLC plate.
Next, the TLC plate was autoradiographed overnight in order to determine
the location of the unique radioactive terminal base. Rf values of
nucleotide monophosphates on PEI-cellulose, developed with 0.25 M LiCl,
were taken from Randerath and Randerath (1967).

2. IDENTIFICATION OF THE 3' TERMINAL NUCLEOTIDE

Analysis of the 3' terminal base follows the same rationale as 5'
terminal analysis, except that RNase T2 was substituted for nuclease P1
in the exhaustive digestion of the 32P-RNA.

P. ENZYMATIC SEQUENCE ANALYSIS OF RNA

Sequences of RNAs not containing modified bases (bacterial 5S
rRNAs) were determined using enzymatic methods and chromatography on
ultrathin polyacrylamide gels (Donis-Keller, 1980; MacDonell and
Colwell, 1984d). A number of specific endoribonucleases have been
characterized and are commercially available. These include T1, U2,
Phy M, B.c., CL3, and M1 (PL Biochemicals, Milwaukee, WI).
Endoribonucleases T1 (Sato and Egami, 1957) and U2 (Uchida et al.,
1974) exhibit a high degree of specificity. RNases T1 and U2 hydrolyze
the phosphate backbone, producing 3'-phosphates at guanines and
adenines, respectively. Enzymes highly specific for cytidine and
uridine have yet to be characterized. Therefore, it is necessary to
employ several RNases in concert in order


sequence variations, however, were observed in virtually every region of
the molecule. Taken as classes, several sequence variations in the
"hypervariable" B/B' helix were found to be characteristic of 5S rRNAs
from clusters of similar strains. The existence of these characteristic
sequences, known as "group-specific signatures", appears to be common to
all ribosomal RNAs (Delihas and Andersen, 1982; Kuntzel et al., 1983).
Based on the possession of common group-specific signatures, the 5S rRNA
sequences determined in this study could be grouped into five sets.
These groups, as well as the characteristic signature sequences, are
listed in figure 4.

B.  CLUSTER  ANALYSIS 

Comparison of the 5S rRNA sequences was accomplished by cluster
analysis, using both unweighted pair-group (UPG) and weighted
pair-group mathematical average (WPGMA) methods. In the case of cluster
analysis by UPG, three different dendrograms were generated from the 5S
rRNA sequence data. These correspond to single linkage, average
linkage, and complete linkage clustering (figure 5a-c). In the case of
cluster analysis by single linkage, strains (and clusters of strains)
were linked at the level of the highest degree of relatedness of any two
of their component sequences. In the case of complete linkage, strains
(and clusters) were linked at the highest level of similarity in the
sequence of one cluster compared with every sequence for strains of
another cluster. Average linkage portrays the mathematical average
similarity (S-value) between clusters. For WPGMA cluster analysis, the
algorithm of Kimura (1980), which estimates evolutionary distances from
pairwise comparisons of nucleic acid sequences, was used. This


Figure 3. Alignment of 5S rRNA sequences. Nucleotide base sequences
of five 5S rRNAs are depicted in the alignment scheme of Erdmann et
al. (1983). Boxed areas indicate regions assumed to participate in
helices. Hypothetical loops and helices are indicated using the
lettering scheme of Hori and Osawa (1979), wherein helix B/B', for
example, would derive from the base-pairing of regions B and B'.
Loops (L) are flanked by the adjacent helix designations, indicated in
lower case.

References:

1 MacDonell and Colwell (1984e)

2 MacDonell and Colwell (1984a)

3 Luehrsen and Fox (1981)

4 Woese et al. (1975)

Figure 2. Sequence ladders of 5S rRNAs generated using an enzymatic
approach.

2a. Sequence ladders generated from the limited enzymatic
digest of 5S rRNA purified from Alteromonas putrefaciens. Lanes are
identified as to the substrate nucleotide (see text for explanation).
Identities of nucleotide bases are listed to the left of the "G" lane.

2b. Sequence ladders generated from limited enzymatic
digest of 5S rRNAs purified from V. vulnificus (left) and A. hydrophila
(right). Identities of the endoribonucleases are listed at the tops of
sequence lanes. A portion of the nucleotide base sequence of V.
vulnificus is listed to the left of the enzyme T1 lane.


RESULTS 


A. 5S rRNA SEQUENCES

The nucleotide base sequences of 5S rRNAs from 26 bacterial strains
were determined by the enzymatic method (Donis-Keller, 1980; MacDonell
and Colwell, 1984d). Sequences were deduced from "sequence ladders"
generated by electrophoresis of limited digests of uniquely end-labeled
5S rRNAs on ultrathin polyacrylamide sequencing gels (Sanger and
Coulson, 1978) and imaged by autoradiography. A typical autoradiogram,
annotated to indicate the character of digest lanes and identity of base
sequence, is shown in figure 2. The generation of several (typically 3
to 10) such sets of sequence ladders was necessary in order to deduce
each composite, i.e., complete, nucleotide sequence. The sequences of
5S rRNAs from strains representing 25 named, or suspected, Vibrio species
(as well as E. coli, included in the study as a control) are listed in
Table 2. Sequence alignments are based on the recommendation of Erdmann
et al. (1983). Helical regions (De Wachter et al., 1982; Erdmann et
al., 1984; MacDonell and Colwell, 1984d) are indicated as boxed-in
areas, and loops and helices are designated by the lettering scheme of
Hori and Osawa (1979). See figure 3. Although a length of 120
nucleotide bases was typical, a range of variation in length, from 119
to 122 nucleotides, was observed. With only a single exception in 26
sequences, i.e., the sequence of 5S rRNA from V. sarmus, the site of
length variation was restricted to the base 40 - 44 region of loop cLc'.
Sequence variation was found to be restricted, to a large degree, to two
regions, referred to as "hypervariable" by Fox et al. (1977). These
regions correspond to helix B/B' and loop cLc' (Figure 3). Minor


T. NUCLEASE S1 LIMITED DIGESTS

End-labeled 5S rRNAs from some species were subjected to limited,
"single-hit" digests using nuclease S1 (PL Biochemicals,
Milwaukee, WI). Sequence ladders from these digests, and from
conventional sequencing digests, were generated simultaneously in order
to determine the locations of single-stranded and helical
(double-stranded) regions in the 5S rRNA molecule. S1 digests were
prepared as described by Maniatis et al. (1982), except that the
concentration of Zn2+ ions was reduced to 5% of that recommended for
hydrolysis of DNA. This was done to reduce exposure of RNA to Zn2+ ions,
which have been shown to efficiently introduce lesions in the phosphate
backbones of RNAs (Butzow and Eichhorn, 1975).

U.  DATA  MANAGEMENT 

Storage  of  sequence  data,  sequence  comparisons,  free  energy 
determinations,  and  generation  of  dot  matrices,  evolutionary  trees  and 
dendrograms,  as  well  as  generation  of  graphics,  was  done  using  either  of 
2 computer systems. These were (1) TRS-80 model 100 computer, outfitted
with 64 kilobytes of random access memory (RAM) (Tandy Corp., Ft. Worth,
TX) or (2) Commodore VIC-20 (Commodore Business Machines, West Chester,
PA), outfitted with 32 kilobytes of RAM, and an 80-column board
(Protecto Enterprises, Barrington, IL). All computer programs employed


in  this  study  are  listed  in  appendix  A. 


xylene cyanol, i.e., upper, band. The wrapped gels were placed on a
sheet of cardboard and, in a darkroom, the radioactive bands were imaged
by exposing the gel to a sheet of X-ray film for approximately 10
seconds, the exact exposure time being determined empirically. Using a
sterile knife blade, the photographic image of the 5S rRNA band was
carefully cut from the film. The film was aligned, and used as a
template for excising the 32P-5S rRNA from the gel.

2.  AUTORADIOGRAPHY  OF  SEQUENCE  LADDERS 

In the case of the imaging of sequence ladders produced by the
limited enzymatic digests, very small quantities of radiation often were
involved. Whether or not to use intensification screens for the
autoradiography was dictated by the yield of labeled termini.
Autoradiography of sequence ladders required at least 50,000 cpm of
32P-RNA per lane, but this could be reduced to 10,000 by using
intensification screens. In general, however, it was possible to read
screened autoradiograms only about 2/3 as far as those exposed
without screens.

Thin sequencing gels were removed from the gel form and wrapped in
plastic wrap. In the darkroom, the gels were covered with X-ray film,
placed in a film cassette, and allowed to expose overnight. When
intensification screens were used, the cassette was placed in a -70 C
freezer. Otherwise it was placed in a -20 C freezer.


REACTION  BUFFERS 


RNases T1, Phy M, and M1: 0.025 mM sodium citrate, pH 5.0; 7 M
urea; 1 mM EDTA; and 0.05% (w/v) each of xylene cyanol FF and
brom-phenol blue.

RNase U2: 0.025 mM sodium citrate, pH 3.5; 7 M urea; 1 mM EDTA;
and 0.05% (w/v) each of xylene cyanol FF and brom-phenol blue.

RNase B.c.: 0.025 mM sodium citrate, pH 5.0.

S. AUTORADIOGRAPHY OF 32P-RNA

The generation of autoradiograms was necessary at two stages. The
first was in locating the 32P-RNA after purification by gel
electrophoresis. The second applied to the imaging of end-labeled
fragments resulting from limited enzymatic digests, i.e., the "sequence
ladder". Kodak XAR-5 film (Kodak Corp., Rochester, NY) was used for
autoradiography. Films were developed with either D-11 or with Kodak
GBX developer.

1.  LOCATING  THE  PURIFIED  32P-RNA  BAND 

After the RNA had been end-labeled and subsequently purified by gel
electrophoresis, the RNA band was located and excised so that the
purified 32P-RNA could be recovered. Relatively large amounts (>10^7
dpm) of radiation were involved; therefore, typical exposure times
ranged from 10 to 30 seconds. After completion of purification of
32P-RNA by electrophoresis, the gel was removed from the gel form and
covered with plastic wrap. When necessary, brittle denaturing gels were
trimmed with a pizza cutter in order to produce smooth edges. In 5%
denaturing acrylamide gels, 5S rRNA bands were located just below the


10% (w/v), depending on the oligomer size range under investigation. A
lane containing a limited alkaline hydrolysis of an aliquot of labeled
RNA was run alongside the sequence lanes in order to mark the position
of each length of n-mer, from n=1 to n=122, in the autoradiogram of the
sequencing gel.
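
The relationship between enzyme specificity and band position can be
made concrete with a small simulation. The sketch below is an
illustration only, not one of the BASIC programs of Appendix A. It
assumes a 5' end-labeled molecule, cleavage on the 3' side of each
susceptible base, and the specificities stated in the text (T1: G;
U2: A; Phy M: A and U; B.c.: C and U); RNase M1 is left out because of
its frame shift. The sequence and the names SPECIFICITY, ladder and
alkaline_ladder are hypothetical.

```python
# Base specificities as described in the text (RNase M1 omitted).
SPECIFICITY = {"T1": "G", "U2": "A", "PhyM": "AU", "Bc": "CU"}

def ladder(seq, enzyme):
    """Lengths of the 5' end-labeled fragments produced by single cuts.

    A cut after base i (1-based) leaves a labeled fragment of length i.
    """
    targets = SPECIFICITY[enzyme]
    return [i for i, base in enumerate(seq, start=1) if base in targets]

def alkaline_ladder(seq):
    """Limited alkaline hydrolysis cuts after every base: n = 1 .. len(seq)."""
    return list(range(1, len(seq) + 1))

toy = "GAUCCGAU"  # toy sequence, not a real 5S rRNA
for enzyme in SPECIFICITY:
    print(f"{enzyme:5s}", ladder(toy, enzyme))
print("OH-  ", alkaline_ladder(toy))
```

Each enzyme lane shows bands only at the lengths ending in its target
base, while the alkaline lane supplies the complete ruler against which
those bands are counted.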

Sequencing gels were cast and run on a 40 cm model SO sequencing
system (BRL, Gaithersburg, MD), to which power was supplied by a Bio Rad
(Richmond, CA) model 3000 high voltage transformer. Simultaneous
electrophoresis of the limited RNA digests in adjacent lanes of a thin
denaturing polyacrylamide gel consistently resulted in a unique set of
well-resolved bands from which the RNA sequence was read directly.

Adjustment of the enzyme-to-substrate ratio so as to achieve
"single-hit" conditions was approached two ways: (1) serial 10-fold
dilutions of the endoribonucleases to achieve the proper titre; and (2)
adjustment of substrate concentration by addition of carrier tRNA.
Adjustment of substrate concentration was generally more useful in
controlling enzyme/substrate ratios.

R.  BUFFERS 

1.  ENZYME  DILUTION  BUFFERS: 

RNases T1 and Phy M: 25 mM sodium citrate, pH 5.0; 7 M urea; 1 mM
EDTA; 0.05% (w/v) xylene cyanol FF; and 0.05% (w/v) brom-phenol blue.

RNase U2: 25 mM sodium citrate, pH 3.5; 1 mM EDTA; 0.05% (w/v)
xylene cyanol FF; and 0.05% (w/v) brom-phenol blue.

RNases B.c. and M1: 25 mM sodium citrate, pH 5.0.


to unequivocally identify the pyrimidine bases. Three of these: Phy M,
specific for adenine and uridine (Donis-Keller, 1980); B.c., specific
for cytidine and uridine (Lockard et al., 1978); and M1, which
hydrolyzes at adenine, guanine and uridine, but not cytidine residues
(PL Biochemicals, unpublished data), were of particular value.

Endoribonucleases useful for sequence analysis, except for RNase M1,
hydrolyze the phosphate backbone, producing 3' phosphates at the site of
specificity. RNase M1, however, hydrolyzes the phosphate backbone to
produce a 5' nucleotide phosphate. This results in the shifting of the
RNase M1 lane one position out of frame, and must be taken into
consideration in analyzing the subsequent sequence ladders. RNase CL3
(Bogusky et al., 1980; Levy and Karpetsky, 1980), with a putative
specificity for cytidine residues, was not employed because of
inadequate specificity.
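
The way several partially specific enzymes are combined to call each
base can be sketched as a lookup on the set of lanes that show a band at
a given position. This is a minimal illustration built only on the
specificities stated above (T1: G; U2: A; Phy M: A and U; B.c.: C and
U); the names CALLS, call_base and read_sequence, and the toy band
positions, are hypothetical, and the frame-shifted RNase M1 lane is not
modeled.

```python
# Which combination of lanes identifies which base.
CALLS = {
    frozenset({"T1"}): "G",
    frozenset({"U2", "PhyM"}): "A",
    frozenset({"PhyM", "Bc"}): "U",
    frozenset({"Bc"}): "C",
}

def call_base(lanes_with_band):
    """Return the base implied by the lanes showing a band, or N if ambiguous."""
    return CALLS.get(frozenset(lanes_with_band), "N")

def read_sequence(bands, length):
    """Read a sequence from band positions.

    bands maps an enzyme name to the set of positions (1-based) at which
    that lane shows a band.
    """
    calls = []
    for pos in range(1, length + 1):
        lanes = {enz for enz, positions in bands.items() if pos in positions}
        calls.append(call_base(lanes))
    return "".join(calls)

toy_bands = {  # toy data, not taken from an actual autoradiogram
    "T1": {1, 6},
    "U2": {2, 7},
    "PhyM": {2, 3, 7, 8},
    "Bc": {3, 4, 5, 8},
}
print(read_sequence(toy_bands, 8))
```

A position that appears only in the B.c. lane is read as cytidine, and
one that appears in both the Phy M and B.c. lanes as uridine, which is
how the two pyrimidines are distinguished without a cytidine-specific
enzyme. Any unexpected combination is reported as N rather than guessed.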

Q.  LIMITED  ENZYMATIC  DIGESTS 

"Sequence  ladders"  from  which  the  nucleotide  base  sequence  of  RNA 
molecules was read directly, were generated by adjusting the ratio of
enzyme (endoribonuclease) to substrate (RNA) so that each end-labeled
RNA molecule was clipped, on average, at one site. This
resulted in a nested set of fragments in which every possible
end-labeled sequence was represented. Limited digests of separate
aliquots of 32P-RNA, using the various endoribonucleases, were used to
generate sets of fragments which were terminated by the enzyme-specific
base on one end and the 32P label on the other. These fragments were
separated on thin 8 M urea polyacrylamide denaturing gels (Sanger and
Coulson, 1978) which ranged in total acrylamide concentration from 7% to


Figure 4. Group-Specific Signatures in the Vibrionaceae. The
hypervariable B/B' helix was found to constitute the smallest sequence
segment by which the major clusters of strains could be distinguished.
The five consensus sequences, therefore, are considered to represent
group-specific signatures (see Delihas and Andersen, 1982).

Figure 5. Unweighted pair group (UPG) analytic dendrograms.

5a. The UPG single linkage dendrogram (i.e.,
evolutionary tree) resulting from the clustering of species on a basis
of overall 5S rRNA sequence homology. Numbers denote difference
matrix elements, indicating the total number of base differences out
of 120 (the nucleotide base length of 5S rRNA).


5b. The UPGMA, or UPG average linkage, dendrogram
resulting from the clustering of species on a basis of overall 5S rRNA
sequence homology.


5c. The UPG complete linkage dendrogram resulting from
the clustering of species on a basis of overall 5S rRNA sequence
homology.


algorithm, in effect, weights transversions against transitions in
nucleotide base mutations in such a way that transversions are
penalized. A mutation from purine to pyrimidine, or from pyrimidine to
purine, constitutes a transversion, and a mutation from one pyrimidine to
another, or from one purine to another, constitutes a transition.
Deletions and insertions are assessed as transversions. The algorithm
assigns a coefficient of evolutionary "distance", K(nuc), to each
pairwise comparison, and is not dependent on there being an identical,
i.e., common ancestral sequence.
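The weighting described above can be illustrated with a minimal Python
sketch. It assumes that K(nuc) takes the form of the two-parameter
estimate of Kimura (1980), K = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)],
where P and Q are the fractions of aligned positions showing transitions
and transversions, respectively. The function names, the use of "-" as
the gap symbol, and the requirement that sequences be aligned beforehand
are assumptions of the sketch, and it is not the BASIC program of
Appendix A.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def classify(a, b):
    """Classify one aligned position as match, transition, or transversion.

    A gap ("-") opposite a base is scored as a transversion, following the
    rule stated in the text for deletions and insertions.
    """
    if a == b:
        return "match"
    if "-" in (a, b):
        return "transversion"
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"
    return "transversion"


def k_nuc(seq1, seq2):
    """Two-parameter distance estimate for two aligned RNA sequences.

    math.log raises ValueError when the sequences are so divergent that
    the argument of the logarithm is not positive.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    kinds = [classify(a, b) for a, b in zip(seq1, seq2)]
    n = len(kinds)
    p = kinds.count("transition") / n    # fraction of transitions
    q = kinds.count("transversion") / n  # fraction of transversions
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For two aligned 5S rRNA sequences the call would be k_nuc(seq_a, seq_b);
multiplying the result by 10^4 gives values on the scale used for the
K(nuc) matrix.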

Computer programs, written in BASIC language, were developed to
compute S-values for sequence comparisons, as well as K(nuc) values (see
Appendix A). The UPG difference matrix and the evolutionary distance,
i.e., K(nuc), matrix are shown in figure 6. Results of cluster
analyses were used to construct dendrograms and to detect evolutionary
relationships.
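The construction of a difference matrix of the kind shown in figure 6
may be sketched as follows. The sketch is in Python rather than the
BASIC of Appendix A, assumes the sequences have already been aligned to
a common length, and uses short hypothetical sequences and labels in the
usage note; none of the names below are taken from the original
programs.

```python
def base_differences(seq1, seq2):
    """Count positions at which two aligned sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq1, seq2) if a != b)


def difference_matrix(sequences):
    """Return {(name1, name2): base differences} for every ordered pair.

    `sequences` maps a species label to its aligned sequence.
    """
    names = list(sequences)
    return {
        (x, y): base_differences(sequences[x], sequences[y])
        for x in names
        for y in names
    }
```

With the hypothetical input {"x": "GCGAUUUG", "y": "GCGAUCUG"}, the
element for the pair ("x", "y") is 1 and the diagonal elements are 0.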

1.  UPG  Analysis 

Dendrograms generated from UPG cluster analyses using single-,
average-, and complete linkage groups are shown in figure 5a-c,
respectively. Dendrograms generated from single linkage clustering
(Figure 5a) were helpful only in indicating maximum relatedness. For
example, it is apparent that Vibrio 5S rRNA sequences are significantly
homologous, with very few differences; thus, the majority of the species
examined appear to belong to a single large cluster. Both average- and
complete linkage clustering, on the other hand, provided additional
information concerning relationships among 5S rRNAs of individual
species. In both cases, two distinct similarity groups were observed.


Figure 6. UPG and K(nuc) difference matrices.


6a. UPG difference matrix. Numbers indicate the base
differences (of 120 total) between the 5S rRNAs from pairs of species.
The difference matrix element for each pair of species is located at
the intersection of that pair. Asterisks indicate identities.

KEY: PH = P. phosphoreum, LE = P. leiognathi (P. angustum), LO = P. logei,
FI = V. fischeri, PA = V. parahaemolyticus, NA = V. natriegens,
AL = V. alginolyticus, FL = V. fluvialis, MI = V. mimicus,
ME = V. metschnikovii, GA = V. gazogenes, CI = V. 'cincinnati',
CA = V. carchariae, DI = V. diazotrophicus, HA = V. harveyi,
CH = V. cholerae, VU = V. vulnificus, DA = V. damsela,
AN = V. anguillarum, MA = V. marinus, UM = strains UH 40 and N 145,
PU = A. putrefaciens, PS = P. shigelloides


6b. K(nuc), or evolutionary distance, difference matrix.
Values at the intersections of pairs of species indicate the
"evolutionary distance" between those species (Kimura, 1980). K(nuc)
values are listed as factors of 10^-4 to facilitate data handling. The
K(nuc) value corresponding to the evolutionary distance between P.
leiognathi and P. phosphoreum, therefore, should be read as .0031.


     PH  LE  LO  FI  PA  NA  AL  FL  MI  ME  GA  CI  CA  DI  HA  CH  VU  DA  AN  MA  UM  PU  PS  HD  HY
PH    1  31 128 129 364 305 270 436 411 351 270 316 108 353 340 343 207 308 329 428 484 421 519 491 557
LE        1  95  96 320 270 236 400 375 316 236 282 376 318 305 308 252 273 294 392 447 385 482 455 520
LO            1 128 227 236 138 297 273 216 204 250 273 216 270 273 284 239 196 428 555 417 590 487 552
FI                1 270 214 181 340 316 299 293 340 318 305 293 250 195 263 204 454 458 324 548 391 455
PA                    1  52  84  64  86  96 129 174  85 140 192 150 193 207 162 415 639 523 615 588 654
NA                        1  94 117  95 106 138 183  95 149 137  95 138 296 216 403 573 461 604 523 580
AL                            1  15 128  74 106 150 128 116 170 192 203 227 163 510 612 472 643 486 550
FL                                1  21  74 151 106 151 207 261 128 171 311 184 491 722 604 700 671 740
MI                                    1  53 129  85 129 184 237 106 149 287 207 517 694 5/7 730 643 711
ME                                        1  74  74 139 129 248 160 204 229 151 528 706 361 740 577 643
GA                                            1  42 172 117 282 239 284 218 139 491 612 548 700 615 682
CI                                                1 218 162 329 193 238 263 184 544 666 600 761 671 740
CA                                                    1  53 105 193 238 311 252 514 692 373 666 586 652
DI                                                        1 159 250 296 253 195 579 706 561 734 373 639
HA                                                            1 237 248 387 364 525 666 550 561 561 627
CH                                                                1  05 403 321 421 621 480 658 544 610
VU                                                                    1 308 434 491 560 447 595 511 577
DA                                                                        1 160 339 555 465 636 480 544
AN                                                                            1 480 627 486 661 356 622
MA                                                                                1 571 506 704 735 733
UM                                                                                    1 371 766 755 797
PU                                                                                        1 646 329 595
PS                                                                                            1 489 554
HD                                                                                                1  53

KEY: PH = P. phosphoreum, LE = P. leiognathi (P. angustum), LO = P. logei,
FI = V. fischeri, PA = V. parahaemolyticus, NA = V. natriegens,
AL = V. alginolyticus, FL = V. fluvialis, MI = V. mimicus,
ME = V. metschnikovii, GA = V. gazogenes, CI = V. 'cincinnati',
CA = V. carchariae, DI = V. diazotrophicus, HA = V. harveyi,
CH = V. cholerae, VU = V. vulnificus, DA = V. damsela,
AN = V. anguillarum, MA = V. marinus, UM = strains UH 40 and N 145,
PU = A. putrefaciens, PS = P. shigelloides, HD = A. media,
HY = A. hydrophila


The first comprised V. parahaemolyticus, V. natriegens, V.
alginolyticus, V. fluvialis, V. metschnikovii, V. mimicus, V. gazogenes,
V. vulnificus, V. cholerae, V. carchariae, V. diazotrophicus, V.
harveyi, and a new Vibrio species tentatively designated V. 'cincinnati'.
The second cluster comprised the Photobacterium species: P. phosphoreum,
P. leiognathi, P. angustum, and V. logei, and "V." fischeri (Table 3).
Although the generic composition of the clusters was identical,
intra-phenetic differences were observed. For example, the V. fluvialis
- V. mimicus and V. cholerae - V. vulnificus doublets clustered by
complete linkage, whereas by average linkage, they appeared to be more
distantly related, not only to each other, but also to the central
Vibrio cluster.

The second major cluster, containing the Photobacterium species,
did not change with linkage method employed. Although V. damsela and V.
anguillarum formed a doublet related to the main Vibrio cluster by
average linkage clustering (Figure 5b), by both single- and complete
linkage, they represented a separate cluster. A. media and A.
hydrophila were only distantly related to the other species of the
Vibrionaceae by cluster analysis, regardless of linkage method. By
average- and complete linkage, however, V. psychroerythrus clustered
with A. media and A. hydrophila, but at a relatively low similarity
value.
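The three linkage rules compared above differ only in how the distance
between two clusters is derived from the pairwise distances of their
members: the minimum (single linkage), the mean (average linkage, or
UPGMA), or the maximum (complete linkage). The following Python sketch
illustrates this under stated assumptions; it is not the program used in
this study, the function and variable names are hypothetical, and the
usage note below draws on three values read from the K(nuc) matrix of
figure 6b.

```python
from itertools import combinations

# How the distance between two clusters is derived from member distances.
LINKAGE = {
    "single": min,
    "complete": max,
    "average": lambda values: sum(values) / len(values),
}


def upg_cluster(dist, method="average"):
    """Agglomerative clustering of a pairwise distance table.

    `dist` maps frozenset({a, b}) to the distance between labels a and b.
    Returns the merges in order as (cluster1, cluster2, height) tuples,
    where each cluster is a tuple of labels.
    """
    combine = LINKAGE[method]
    labels = sorted({label for pair in dist for label in pair})
    clusters = [(label,) for label in labels]
    merges = []

    def height(pair):
        left, right = pair
        return combine([dist[frozenset((a, b))] for a in left for b in right])

    while len(clusters) > 1:
        # Join the two clusters separated by the smallest linkage distance.
        left, right = min(combinations(clusters, 2), key=height)
        merges.append((left, right, height((left, right))))
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left + right)
    return merges
```

Taking the PH-LE, PH-LO, and LE-LO elements (31, 128, and 95), all three
rules join PH and LE first at a height of 31; LO then joins that pair at
95 by single linkage, 111.5 by average linkage, and 128 by complete
linkage.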

2.  Evolutionary  Distance 

The dendrogram generated from the cluster analysis using the Kimura
algorithm for estimation of evolutionary distance (Figure 7) bears a
striking resemblance to that generated by UPGMA (average linkage, figure


Palindrome            GCGAUUUAG-  -CUGACU-C-A-GCCG-ACUCAGUC  GAUUUAGCG

V. gazogenes          GCG UUU G   CUGA  U C A  GCCG  ACUCAGU  GA  AGCG
V. 'cincinnati'       GCG UUU G   CUGA  U C A  GCCG  ACUCAGU  GA  AGCG
V. mimicus            GCG UUU G   CUGA  U C A  CCG   ACUCAG   GA  AGCG
V. fluvialis          G G UUU G   CUGA  U C A  CCG   ACUCAG   GA  AGCG

V. diazotrophicus     GCGAUUU G   CUGA  U C A  GCCG  ACUCAG   GA  UUAGCG
V. carchariae         GCGAUUU G   CUGA  U C A  CCG   ACUCAG   GA  UUAGCG
V. mimicus            GCG UUU G   CUGA  U C A  CCG   ACUCAG   GA  AGCG
V. fluvialis          G G UUU G   CUGA  U C A  CCG   ACUCAG   GA  AGCG

V. diazotrophicus     GCGAUUU G   CUGA  U C A  GCCG  ACUCAG   GA  UUAGCG
V. carchariae         GCGAUUU G   CUGA  U C A  CCG   ACUCAG   GA  UUAGCG
V. parahaemolyticus   G G UUU G   CUGA  U C A  CCG   ACUCAG   GA  UAGCG
V. fluvialis          G G UUU G   CUGA  U C A  CCG   ACUCAG   GA  AGCG

V. diazotrophicus     GCGAUUU G   CUGA  U C A  GCCG  ACUCAG   GA  UUAGCG
V. alginolyticus      GCG UUU G   CUGA  U C A  GCCG  ACUCAG   GA  UAGCG
P. logei              GCG U U G   CUGA  U C A  GCCG  ACUCAG   G   UAGCG
A. hydrophila         GCG U G     CUGA  U C A  GCCG  ACUCAG   G   UAGCG

V. diazotrophicus     GCGAUUU G   CUGA  U C A  GCCG  ACUCAG   GA  UUAGCG
V. alginolyticus      GCG UUU G   CUGA  U C A  GCCG  ACUCAG   GA  UAGCG
P. logei              GCG U U G   CUGA  U C A  GCCG  ACUCAG   G   UAGCG
V. damsela            GCG U U G   CUGA  U C A  GCCG  ACUCAG   G   AGCG
V. anguillarum        G G U U G   CUGA  U C A  GCCG  ACUCAG       GCG

V. diazotrophicus     GCGAUUU G   CUGA  U C A  GCCG  ACUCAG   GA  UUAGCG
V. alginolyticus      GCG UUU G   CUGA  U C A  GCCG  ACUCAG   GA  UAGCG
P. logei              GCG U U G   CUGA  U C A  GCCG  ACUCAG   G   UAGCG
V. fischeri           GC U U G    CUGA  U C A  GCCG  ACUCAG   G   UAGCG
A. putrefaciens       GC U U G    CUGA  C A    CCG   ACUCAG   G   U GCG

V. diazotrophicus     GCGAUUU G   CUGA  U C A  GCCG  ACUCAG   GA  UUAGCG
V. alginolyticus      GCG UUU G   CUGA  U C A  GCCG  ACUCAG   GA  UAGCG
V. vulnificus         GCG UUU G   CUGA  C A    CCG   ACUCAG   GA  UAGCG
A. putrefaciens       GC U U G    CUGA  C A    CCG   ACUCAG   G   U GCG

V. diazotrophicus     GCGAUUU G   CUGA  U C A  GCCG  ACUCAG   GA  UUAGCG
V. alginolyticus      GCG UUU G   CUGA  U C A  GCCG  ACUCAG   GA  UAGCG
V. mimicus            GCG UUU G   CUGA  U C A  CCG   ACUCAG   GA  AGCG
V. fluvialis          G G UUU G   CUGA  U C A  CCG   ACUCAG   GA  AGCG

V. logei            GCG-U-U-G - CUGA-U-C-A-GCCG-ACUCAG - G - UAGCG

V. alginolyticus    GCG-UUU-G - CUGA-U-C-A-GCCG-ACUCAG - GA -- UAGCG

V. anguillarum      G-G-U-U-G - CUGA-U-C-A-GCCG-ACUCAG - ----GCG

V. damsela          GCG-U-U-G - CUGA-U-C-A-GCCG-ACUCAG - G - AGCG

V. diazotrophicus   GCGAUUU-G - CUGA-U-C-A-GCCG-ACUCAG - GA-UUAGCG


Visual inspection of these sequences suggested the possibility that
base sequences from certain species of the Vibrionaceae, representing
this palindromic region, might be arranged in such a way that a
sequential drift in base sequence could be detected. The five species
of the above example are arranged thus:


V. diazotrophicus   GCGAUUU-G - CUGA-U-C-A-GCCG-ACUCAG - GA-UUAGCG

V. alginolyticus    GCG-UUU-G - CUGA-U-C-A-GCCG-ACUCAG - GA -- UAGCG

V. logei            GCG-U-U-G - CUGA-U-C-A-GCCG-ACUCAG - G - UAGCG

V. damsela          GCG-U-U-G - CUGA-U-C-A-GCCG-ACUCAG - G - AGCG

V. anguillarum      G-G-U-U-G - CUGA-U-C-A-GCCG-ACUCAG - ----GCG


When the same approach to arrangement of sequences of the putative
54-base palindrome was applied to all the Vibrionaceae 5S rRNA
sequences, eight groupings, all of which appear to suggest evolutionary
relationships among component species, were obtained (Table 5). The
relationships implied in the groupings suggest that a single, common,
simultaneous solution to the phylogeny of the eight species can be
portrayed as an evolutionary tree. See figure 9.


Figure 8. Dot matrix analysis of inverted repeats (palindromes). Dot
matrices were generated by comparing truncated sequences, consisting
of bases 11 to 110 (of 120) of 5S rRNAs arranged 5' to 3' versus 3' to
5'. The search algorithm was set to ignore matches less than 3
consecutive bases in length.


8a. This pattern (boxed-in region) is typical of all of
the 5S rRNAs of the genus Vibrio sensu strictu, and indicates the
existence of a degenerated 33-base inverted repeat (palindrome). The
sequence shown here is from V. vulnificus.


8b. This is a composite dot matrix in which the
palindromic sequences from 5 Vibrio species have been "overlaid" by
use of a computer, and indicates the existence of a degenerated
54-base palindrome (boxed-in region). Although the palindrome is not
evident in the sequence of any single Vibrio species, composites of
numerous combinations of sequences from Vibrio species yield the same
result. The small solid arrows indicate sequences common to V.
carchariae and V. diazotrophicus. The large solid arrow indicates
sequences common to V. diazotrophicus and V. alginolyticus. Open
arrows indicate the portion of the palindrome contributed by V.
cholerae.

Key: Vch = V. cholerae, Vca = V. carchariae, Va = V. alginolyticus, Vd
= V. diazotrophicus, Vn = V. natriegens. Frequency of occurrence: x =
1, (open diamond) = 2, (solid diamond) = 3, (circle) = 4, (solid
square) = 5.


8c. Composite dot matrix map. The boxed-in region
indicates the position of a degenerated 52-base palindrome, portions
of which are contributed by species from several RNA superfamilies.

Key: Vn = V. natriegens, CS = Calyptogena symbiont (Stahl et al.,
1984), Ta = Thermus aquaticus (Dams et al., 1983). Frequency of
occurrence: (open square) = 1, (circle) = 2, (solid square) = 3.


D.  REPEATED  AND  PALINDROMIC  SEQUENCES 

A computer program was developed (MacDonell and Colwell, 1984,
manuscript in review) which searches nucleic acid sequences for repeated
and palindromic sequences, in order to measure the frequency of
occurrence and extent of conservation. Data for 5S rRNA sequences
determined in this study and for those reported in the literature
(Erdmann et al., 1984) were analyzed by the program. Results of the
analyses revealed several palindromes (inverted repeats). Repeated
sequences, however, appear to occur infrequently, and are limited to
regions of <10 bases. Dot matrices with palindromic or repeated
sequences are shown in figure 8. Two regions in the base sequences of
5S rRNAs from species of the Vibrionaceae appear to have been derived
from palindromes. These are (1) a 54-base region extending from 61A
through Gto:

GCGAUU -- G - CUGACU-C-A-GCCG-ACUCAGUC - G -- UUAGCG

and (2) a 33-base region extending from GT3 through G10+:

GAUG -- AGUGU - UUU - UGUGA -- GUAG

which appears to have derived from two smaller palindromes:

GAUGGUAG -

and

- AUGUGAGAGUA-
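The dot matrix search for palindromes may be sketched as follows. This
is a minimal Python illustration and not the program cited above: it
assumes, following the legend to figure 8, that a sequence written 5' to
3' is compared with the same sequence written 3' to 5', and that runs of
fewer than 3 consecutive matching bases are ignored. The function name
and the min_run parameter are hypothetical.

```python
def inverted_repeat_dots(seq, min_run=3):
    """Return the dot matrix cells belonging to palindromic runs.

    Cell (i, j) compares base i of the sequence with base j of the
    reversed sequence. A cell is reported only if it lies on a diagonal
    run of at least `min_run` consecutive matches.
    """
    rev = seq[::-1]
    n = len(seq)
    dots = set()
    for i in range(n):
        for j in range(n):
            if seq[i] != rev[j]:
                continue
            # Skip cells that continue a run already measured from its start.
            if i > 0 and j > 0 and seq[i - 1] == rev[j - 1]:
                continue
            run = 0
            while i + run < n and j + run < n and seq[i + run] == rev[j + run]:
                run += 1
            if run >= min_run:
                dots.update((i + k, j + k) for k in range(run))
    return dots
```

Applied to the smaller palindrome GAUGGUAG, which reads identically in
both directions, the search marks every cell of the main diagonal.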

Although the sequence of the 33-base palindrome is stable in all
Vibrio 5S rRNAs examined to date, slight differences in the extent of
conservation of the sequence of the 54-base palindrome were detected by
comparison of the 5S rRNA sequences. Five species thus compared are
listed below.


Table 4. Relative Stability of Helical
and Single-stranded Regions


                 Frequency of mutation
                 per 100 bases

Helix*
  A                5.2
  B               12
  C                0
  C'               2.4
  B'              11.2
  E                4.3
  D                2.7
  D'               3.3
  E'               4.3
  A'               4
  All helices      5.3

Loop*
  aLb              2.6
  bLc             10.9
  cLc'            15.6
  c'Lb'            0
  b'Le             0
  eLd              0
  dLd'            14.5
  d'Le'            0
  e'La'            7.2
  All loops       10.9

Key: * Refer to Figure 3


5b). The single difference in the clustering of component phena is the
location of the V. cholerae - V. vulnificus cluster. The location of
the V. cholerae - V. vulnificus doublet in the Kimura dendrogram more
closely resembles UPG single linkage clustering (Figure 5a). The
composition and relative locations of the Photobacterium cluster, the A.
media - A. hydrophila - V. psychroerythrus cluster, and the V. damsela -
V. anguillarum doublet are identical to that predicted by UPGMA
analysis.

C. CONSERVED AND HYPERVARIABLE REGIONS

As suggested by Fox et al. (1977), certain regions of the 5S rRNA
molecule are characterized by a high degree of sequence variation
(hypervariable regions), whereas other regions are apparently resistant
to mutation. Locations of the hypervariable regions, detected by
analysis of the 26 5S rRNA sequences determined in this study, are as
follows (listed in order of increasing stability): helix B/B', the base
40 - 48 region of loop cLc', loop e'La', base pair 83/130, and to much
lesser degrees, helix D/D' and loop dLd' (Figure 3). Highly conserved
regions in the sequences of 5S rRNAs were detected at (in order of
increasing variability) region C of helix C/C', base 49 - 58 region
overlapping loop cLc' and helix C/C', base 87 - 100 region overlapping
helices D/D' and E/E', and terminal helix A/A' (Figure 3). The relative
stabilities of base sequences associated with these regions, expressed
as percent stability, are listed in Table 4.


Table 3. Genera sensu strictu Vibrio and Photobacterium


Vibrio sensu strictu 1

V. alginolyticus
V. carchariae
V. cholerae
V. "cincinnati"
V. diazotrophicus
V. fluvialis
V. gazogenes
V. harveyi
V. metschnikovii
V. mimicus
V. natriegens
V. parahaemolyticus
V. vulnificus


Photobacterium sensu strictu 1

P. angustum 2
P. fischeri
P. leiognathi
P. logei
P. phosphoreum


Vibrio species not fitting into sensu strictu Vibrio 3

V. anguillarum
V. damsela
V. fischeri (see Photobacterium)
V. logei (see Photobacterium)
V. marinus
V. psychroerythrus (see Results)


Key:

1 On a basis of 5S rRNA phylogeny

2 Possibly a synonym or biovar of P. leiognathi (see Results)

3 The following named Vibrio species have not yet been evaluated:
V. campbellii, V. costicola, V. furnissii, V. hollisae,
V. nereis, V. nigripulchritudo, V. ordalii, V. orientalis,
V. pelagius, V. proteolyticus, V. splendidus


Figure 9. Evolutionary tree based on palindromic sequence analysis.
This dendrogram is the graphic representation of one simultaneous
solution for the eight groups of "sequence drift" observed for the
54-base palindrome (see Table 5). Since the 54-base palindrome is
most conserved in the 5S rRNA from V. diazotrophicus and most
randomized in that from A. hydrophila, it is suggested that V.
diazotrophicus may represent the least highly evolved of the Vibrio
species. Three bifurcations are evident, indicating the existence of
"intermediate" common ancestors, suggesting a high level of
relatedness between V. carchariae and V. parahaemolyticus; V. fischeri
and A. putrefaciens; and V. mimicus and V. fluvialis.


E. NUCLEASE S1 LIMITED DIGESTS

Nuclease S1 (from Aspergillus oryzae) preferentially hydrolyzes
single-stranded (ss) regions in nucleic acids, with relative affinities
of approximately 2000:1 (ss:ds) compared with double-stranded regions
(Vogt, 1980). Limited, i.e., "single-hit", digests of 32P-5S rRNA, using
nuclease S1, were separated on thin sequencing gels alongside
conventional sequencing ladders in order to map the locations of
nucleotide bases participating in the formation of single-stranded
regions (Figure 10). Using this method, composite maps were generated
for 5S rRNAs from P. shigelloides, V. alginolyticus, V. diazotrophicus,
V. natriegens, and V. psychroerythrus. Based on these data, the
following consensus map was constructed (using the V. diazotrophicus
sequence):

[Consensus nuclease S1 map of the V. diazotrophicus 5S rRNA sequence]

where broken boxes identify regions which were occasionally clipped
(frequency < 50%); solid boxes identify regions which were frequently
clipped (frequency > 50%); and hatched areas identify regions which were
invariably clipped (frequency = 100%).

No significant differences in hydrolysis patterns were detected
over a wide range of enzyme-to-substrate ratios (10^2-fold range);
incubation times (30 seconds to 6 minutes); incubation temperatures (10
to 35 C); or 5S rRNA species employed as a substrate.

F.  FREQUENCY  OF  OCCURRENCE  OF  "STOP"  AND  "RNY"  CODONS 

Since the statistical distribution of "stop" (UAA, UAG, UGA), and
"RNY" (purine, any base, pyrimidine) codons in a reading frame is


Figure 10. Nuclease S1 hydrolysis ladders. Limited nuclease S1 digests
were run alongside sequence ladders in order to indicate the locations
of single-stranded regions. Nuclease S1 and enzymatic sequence ladders
were generated from limited digests of 5S rRNA from V. logei and P.
leiognathi. Identities of the endonuclease lanes are as indicated in
figure 2. Arrows mark the positions of lanes resulting from limited S1
digestion. See text for composite S1 digest map.


diagnostic for the existence of a direct coding function (Shepherd,
1981; 1982), a computer program was written (appendix A) to search each
reading frame, in turn, and compile the frequency of occurrence of
"STOP" and "RNY" for each. Results obtained from application of the
search algorithm to 5S rRNA sequences determined in this study are
presented in Table 6. Of the 28 species of the Vibrionaceae for which
5S rRNA sequences have been determined, only A. hydrophila, A. media,
and V. psychroerythrus had at least one stop codon in each reading
frame. The average frequency of occurrence of "stop" codons in each of
the three reading frames was 6.6% (frame 1), 10.8% (frame 2), and 1.45%
(frame 3). The expected frequency, based on random occurrence, is 3/4^3
or 4.7%. The average frequency of occurrence of "RNY" codons in each
reading frame was 22.6% (frame 1), 18.7% (frame 2), and 34.9% (frame 3).
The probability of occurrence of "RNY" codons in a random sequence is
1/2^2 or 25%. By comparing each observed frequency with that expected
from a random sequence, percent change (Δ%) values for each reading
frame were determined. These are presented in the form of a histogram
(Figure 11).
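The reading-frame search described above may be sketched as follows.
The sketch is in Python rather than the language of the program in
appendix A, and the function names are hypothetical. It also assumes
that percent change is defined as 100 x (observed - expected) /
expected, a definition that is not stated explicitly in the text.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}

EXPECTED_STOP = 3 / 4**3  # three of the 64 possible codons
EXPECTED_RNY = 1 / 2**2   # purine first (1/2) times pyrimidine last (1/2)


def codon_frequencies(seq, frame):
    """Return (stop frequency, RNY frequency) for reading frame 1, 2, or 3.

    Frame 1 starts at the 5' terminus; incomplete trailing codons are
    ignored.
    """
    codons = [seq[i:i + 3] for i in range(frame - 1, len(seq) - 2, 3)]
    stops = sum(1 for c in codons if c in STOP_CODONS)
    rny = sum(1 for c in codons if c[0] in PURINES and c[2] in PYRIMIDINES)
    return stops / len(codons), rny / len(codons)


def percent_change(observed, expected):
    """Deviation of an observed frequency from the random expectation."""
    return 100 * (observed - expected) / expected
```

For example, percent_change(0.349, EXPECTED_RNY) expresses the average
"RNY" frequency reported for frame 3 relative to the 25% expected from
a random sequence.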


Table 6. Occurrence of "RNY" and "STOP" Codons in 5S rRNA Reading Frames

"STOP" CODONS   "STOP" FREQ.   "RNY" CODONS   "RNY" FREQ.

Values expected for random sequences: "STOP", 4.6% (1.8/molecule); "RNY", 25% (9.75/molecule)


Figure 11. Histogram of "RNY" and "STOP" codon frequency. The data
for occurrence of "RNY" and "STOP" codons (Table 6) are presented as a
percent of expected random frequency. Reading frames 1 (starting with
the 5' terminus) and 2 both exhibit (1) an increased frequency of
"STOP" codons and (2) a decreased frequency of "RNY" codons. Reading
frame 3, however, contains almost no "STOP" codons, and a
significantly increased number of "RNY" codons, suggesting a conserved
direct coding function.

DISCUSSION


A.  MOLECULAR  BASIS  OF  PHYLOGENETIC  INFERENCE 

A logical interpretation of the phylogenetic data presented here
requires the following assumptions: that prokaryotic species are
monophyletic, and that eubacterial species share a common ancestor.

Both are reasonable assumptions since there are no data available
suggesting otherwise. The first assumption, that prokaryotes are
monophyletic, suggests that prokaryotes, unlike eukaryotes, evolved as
cells, and not assemblages of organelles or sub-cellular components with
independent phylogenetic histories. Therefore, phylogeny of a part, the
ribosome for example, should reflect the phylogeny of the whole; that
is, the phylogeny of bacterial ribosomal RNA is equivalent to the
phylogeny of the bacterial cell. Although true for prokaryotes, this is
not believed to be the case for eukaryotes, since there is ample evidence
suggesting that chloroplasts, mitochondria, and nuclei all are of
distinct and independent prokaryotic phylogenies.

The second assumption, in part related to the first, suggests that
all prokaryotic species share a common origin. Although difficult to
prove, there is evidence to support this view. For example, only a
single kind of protein synthesis apparatus occurs in all known forms of
life (Vogel et al., 1984). Furthermore, tRNAs and ribosomal RNAs from
the broadest possible range of species share striking similarities,
strongly suggesting a common origin. Since the sequences can be
considered highly conserved over a time frame of more than a billion
years, nucleotide base sequences of ribosomal RNAs provide the
information from which evolutionary relationships can be deduced.
Comparisons among sequences of ribosomal RNAs appear to be ideal for
inference of phylogenetic relationships (see Kuntzel et al., 1981;
Kuntzel, 1982; Kuntzel et al., 1983). Unfortunately, the number of
known 5S rRNA sequences is small, with only 43 eubacterial sequences
published to date (Erdmann et al., 1984; Dekio et al., 1984; MacDonell
and Colwell, 1984a,b,d,e), including 21 Gram-negative species. This is
too few to establish, conclusively, a phylogeny for eubacteria.

A major reason for the small number of eubacterial 5S rRNA
sequences presently available for analysis is that more emphasis has
been placed on investigation of deep phylogenetic branching than of
phylogenetic relationships at the genus and species level. The focal
point has been, primarily, the saltation of archaebacteria and
eubacteria, and deep branches within each group (Stackebrandt and Woese,
1984). In addition, several laboratories undertaking pioneering work on
comparative sequencing of ribosomal RNA as a phylogenetic probe have
focussed on characterization of archaebacterial species (see Fox et al.,
1982; Woese, 1982). Clearly, methods for comparative sequencing of
ribosomal RNAs are just now being developed. It is too soon to expect
large compilations of rRNA sequences, but the situation will alter in
the immediate future.

The main objective of this study was to analyze relationships among
species of a single taxon of the eubacteria, the family Vibrionaceae,
and, thereby, provide information needed for resolution of several
issues. The first, and most immediate, was the clarification of
taxonomic relationships among the 36 species of the Vibrionaceae and
assessment of the utility of the polyphasic approach to bacterial
systematics, upon whose foundations the taxonomy of the Vibrionaceae
(Citarella and Colwell, 1970; Colwell, 1970; Colwell, 1971) was
constructed. The second was to evaluate the ability of comparative
sequence analysis to clarify phylogenetic relationships at the species
level. Lastly, and most important, considering the impact on bacterial
systematics, was to provide a basis for selection of phenotypic
characters correlated with phylogeny for the Vibrionaceae, the purpose
being replacement of tables of key characteristics constructed from a
priori assumptions of phylogenetic relationships.

Several  methods  exist  for  estimation  of  evolutionary  relationships 
among  nucleic  acid  sequences,  based  on  sequence  similarities  (Klotz  et 
al  . ,  1979;  Kimura,  1980;  Li,  19B1).  These  derive,  for  the  most  part, 
from  modifications  of  clustering  of  difference  matrix  data  by  unweighted 
pair-group  <UP6)  algorithms.  The  modifications  are,  in  general,  based 
on  empirical  or  statistical  models  which  correct  for  the  tendency  for 
transitions  (purine  to  purine,  or  pyrimidine  to  pyrimidine)  to  occur 
with  significantly  greater  frequency  than  transver si ons  (purine  to 
pyrimidine,  or  pyrimidine  to  purine)  (Kimura,  1980);  different  rates  of 
mutation  (Klotz  et  al.f  1979;  Li,  19B1);  or  compensate  for  lack  of 
reference  ancestral  or  prototype  nucleotide  sequences  (Kimura,  1980;  Li, 
1981).  Application  of  the  UPG  method,  despite  its  inherent  inability  to 
compensate  for  the  interactions  mentioned  above,  produced  virtually  the 
same  clustering  (Figure  3b)  as  the  Kimura  algorithm  (Figure  7),  the 
latter  being  considered  by  many  investigators  as  the  most  appropriate 
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for  nucleic  acid  sequence  comparisons.  Either,  however,  appears  to  be 


more favorable for estimation of phylogenetic relationships than the
S_AB coefficients (Woese, 1982; Woese et al., 1984; Stackebrandt and
Woese, 1984) commonly employed in 16S rRNA studies. A correlation
between S_AB values and sequence homology has not been shown, since S_AB
values greater than 0.4 tend to overestimate homologies significantly,
whereas values below 0.4 tend to underestimate to the same degree (Hori
and Osawa, 1982). The result is an unacceptable skewing of evolutionary
distances estimated using the coefficient. This is not altogether
surprising, since the S_AB was developed to estimate evolutionary
distances from oligomer catalogs. Had methods for direct determination
of nucleotide base sequences been available at the time, a more robust
coefficient might have resulted.
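
The transition/transversion correction discussed above can be illustrated with a short sketch. It assumes the commonly cited form of the Kimura (1980) two-parameter estimate, K = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the observed fractions of transitions and transversions. The function name and the choice to skip gapped positions are illustrative assumptions, not a transcription of the program listed in Appendix A.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U", "T"}


def kimura_distance(seq_a, seq_b):
    """Return (K, P, Q) for two aligned sequences of equal length.

    P = fraction of transitions, Q = fraction of transversions.
    Assumption: positions with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            continue
        # A transition stays within purines or within pyrimidines.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError for saturated pairs (1 - 2P - Q <= 0).
    k = -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
    return k, p, q
```

For two aligned 5S rRNA sequences, `kimura_distance(a, b)[0]` gives the estimated number of substitutions per site; the uncorrected difference count used in UPG clustering corresponds to `P + Q`.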

B. PHYLOGENY OF THE VIBRIONACEAE

Unweighted pair-group (UPG) analysis, using single-, average-, and
complete linkage (Figure 3a-c), as well as estimation of evolutionary
distance, employing the Kimura (1980) algorithm and comparative sequence
data (Figure 7), were applied to construct evolutionary trees. As
expected, the UPG single- and UPG complete linkage analyses were of
limited utility in defining phylogenetic relationships, although
helpful in some cases, such as analysis of relationships between V.
anguillarum and the major Vibrio cluster. Dendrograms generated using
5S rRNA sequence comparisons and UPGMA (average linkage) and using
estimates of evolutionary distance (Kimura, 1980) were in remarkably
good agreement, suggesting that most of the named Vibrio species share a
recent, common ancestor. These species are concluded to comprise the
genus Vibrio sensu strictu (MacDonell and Colwell, 1984b). V.
anguillarum and V. damsela, and, presumably, V. ordalii and V.
tubiashii, which were considered previously to be biovars of V.
anguillarum, are concluded to comprise a separate genus.
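
As an illustration of the average linkage (UPGMA) clustering referred to above, the following is a minimal sketch operating on a pairwise distance table. The function name, the input layout, and the toy distances in the usage note are hypothetical and are not taken from the difference matrices of this study.

```python
from itertools import combinations


def upgma(labels, distances):
    """Cluster by unweighted pair-group averaging.

    labels: list of taxon names (strings).
    distances: dict mapping (name_a, name_b) tuples to a distance.
    Returns a list of merges (cluster_a, cluster_b, branch_height),
    where merged clusters are represented as nested tuples.
    """
    sizes = {label: 1 for label in labels}  # cluster -> number of leaves
    d = {frozenset(pair): dist for pair, dist in distances.items()}
    merges = []
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))] / 2
        new = (a, b)
        for c in sizes:
            if c == a or c == b:
                continue
            # Size-weighted mean equals the plain average over all leaf pairs.
            d[frozenset((new, c))] = (
                sizes[a] * d[frozenset((a, c))]
                + sizes[b] * d[frozenset((b, c))]
            ) / (sizes[a] + sizes[b])
        sizes[new] = sizes.pop(a) + sizes.pop(b)
        merges.append((a, b, height))
    return merges
```

With the toy input `upgma(["A", "B", "C", "D"], {("A","B"): 2, ("A","C"): 8, ("A","D"): 8, ("B","C"): 8, ("B","D"): 8, ("C","D"): 4})`, the closest pair A and B is joined first, then C and D, and the two clusters are joined last. Single and complete linkage differ only in replacing the weighted mean with the minimum or maximum of the two distances.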

V. marinus, based on the criterion of comparative analysis of 5S
rRNA sequences, shares only a remote common ancestor with any of the
named Vibrio species (MacDonell and Colwell, 1984b), and, as a
consequence, the 5S rRNA sequence is concluded not to permit clustering
of this species with those of either of the two groups described above.
Whether V. marinus constitutes a species of the family Vibrionaceae
sensu strictu requires further study.

Motivated by the observation of Van Landschoot and DeLey (1983),
based on rRNA/DNA hybridizations, that Alteromonas putrefaciens may
share a moderately recent common ancestor with the genus Vibrio, the
sequence of the 5S rRNA from that species was determined. Although it
failed to cluster with species of either the genus Vibrio sensu strictu
or V. anguillarum and V. damsela, it shares, with both groups, a level
of relatedness sufficiently high to suggest that it probably is a member
of the Vibrionaceae. Interestingly, A. putrefaciens shares a common
ancestor with two abyssal marine isolates, UM40 and W145 (Deming et al.,
1984), at a phylogenetic depth suggesting a mutual relatedness at
approximately the family level.

The genus Photobacterium comprises at least four species, P.
phosphoreum, P. leiognathi, V. fischeri, and V. logei. P. angustum
possesses a 5S rRNA the sequence of which is identical to that of P.
leiognathi, suggesting, but not proving, that they may be biovars of the
same species. Although there is no reason why two species cannot share
an identical 5S rRNA sequence, it appears likely that the time scale in
which a common ancestor gives rise to two distinct species is
sufficiently large that one (or several) base differences out of the 120
comprising the 5S rRNA molecule would result. The rate of spontaneous
mutation in 5S rRNA sequences is in the order of 10"A years (Stahl et
al., 1984).

V. psychroerythrus was originally described by D'Aoust and Kushner
(1972) with the suggestion that it be allocated to the genus Vibrio,
although it was never properly validated as a Vibrio species. From
comparison of its 5S rRNA sequence with those of the other species of
the Vibrionaceae included in this study, it is concluded to be misnamed.
The only species with which it shares other than the most remote common
ancestor are A. hydrophila and A. media, which, based on 5S rRNA
sequences, as well as rRNA/DNA hybridizations (J. DeLey, personal
communication), cannot be considered members of the Vibrionaceae. A
proposal to create a new family composed of the species V.
psychroerythrus, A. hydrophila, A. media and related species is being
prepared (MacDonell and Colwell, manuscript in preparation).

The correct taxonomic position of P. shigelloides has been a topic
of controversy since the original description was proposed by Ferguson
and Henderson (1947), and the species has been shifted back and forth
between the families Enterobacteriaceae and Vibrionaceae (Hendrie et
al., 1971). The reason for the apparent difficulty in resolving the
taxonomic position of this species is that it possesses phenotypic
characteristics considered, a priori, to be characteristic of each
family. The 5S rRNA sequence, however, appears to be useful in resolving
the controversy, since an extensive homology with the 5S rRNA from
Proteus mirabilis (Enterobacteriaceae) was observed. In fact, it is
evident that Proteus mirabilis shares with Plesiomonas shigelloides a
much more recent common ancestor than with Proteus vulgaris (Figure 12).
It is concluded, therefore, that P. shigelloides should be reassigned to
the Proteeae.

Compared with the classical, i.e., alpha, taxonomy for other
bacterial groups, viz., Pseudomonadaceae and Bacillaceae, the taxonomy
of the Vibrionaceae, derived extensively from application of a
polyphasic approach, which now includes 5S rRNA, is remarkably consonant
in all phases. In fact, had it been known that the DNA base
composition, in the range of 45 to 51 mol% GC, is a phylogenetic
characteristic of the genus Vibrio sensu strictu, the polyphasic
approach, without the 5S rRNA data, would have delineated accurately the
genera of the Vibrionaceae. Even without prior knowledge of the utility
of DNA base composition for construction of a taxonomy of Vibrio spp.,
the polyphasic approach has clearly provided a means of estimating a
phylogenetic taxonomy.

C.  PALINDROMIC  SEQUENCES 

Since the limit of resolution of comparative sequence analyses of
bacterial 5S rRNAs occurs at the species-biovar boundary, it was
speculated that, if repeated or palindromic sequences exist in the 5S
rRNA primary structure, it might be possible to map mutations in a
sequential fashion and in such a way that insight into evolutionary
relationships at the level of biovar could be gleaned. It was
gratifying, therefore, to discover that several degenerated palindromic
(inverted repeat) sequences do exist in Vibrionaceae 5S rRNA. While the
simultaneous solution of the course of sequential mutations in the least
highly conserved palindrome (Figure 9) appears to lack the sensitivity
necessary to define relationships at the biovar level, it did
corroborate evolutionary relationships suggested by UPG analysis using
difference matrix and K(nuc) values. It is probable that after a
sufficient number of 5S rRNA sequences from a wide range of bacterial
species have been determined, palindromic sequences may yield a unique
method for analyzing evolutionary trees based on nucleic acid data.

Figure 12. Evolutionary relationships between P. shigelloides and
Proteus spp. Sequence homologies among 5S rRNAs from the species
Plesiomonas shigelloides, Proteus mirabilis, and Proteus vulgaris,
analyzed by UPGMA, indicate that among the three species, the most
recent common ancestor is shared by P. shigelloides and P. mirabilis.
From this, as well as overall homology between the sequence of the 5S
rRNA from P. shigelloides and those from Proteus species, it is
concluded that P. shigelloides is misnamed, and should be allocated
to the Proteeae.

Even though there is no fossil record of ancestral nucleic acid
sequences, perfect palindromes presumably may be considered ancestors of
degenerated sequences.
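
A dot matrix search for repeated and palindromic sequences can be sketched as follows. This is a simplified illustration, not the Appendix A program: it assumes that a "palindromic" (inverted repeat) match means an oligomer whose reverse complement occurs elsewhere in the same RNA, and the function names and window parameter are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def dot_matrix(seq_a, seq_b, window):
    """Return the set of 0-based (i, j) start positions at which the
    window-length oligomer of seq_a equals that of seq_b."""
    hits = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            if seq_a[i:i + window] == seq_b[j:j + window]:
                hits.add((i, j))
    return hits


def reverse_complement(seq):
    """Reverse complement of an RNA sequence written with A, C, G, U."""
    return seq.translate(COMPLEMENT)[::-1]


def inverted_repeats(seq, window):
    """Return (i, k) pairs such that seq[i:i+window] is the reverse
    complement of seq[k:k+window]; i == k marks a perfect palindrome."""
    n = len(seq)
    hits = dot_matrix(seq, reverse_complement(seq), window)
    # Map positions in the reverse complement back onto the original strand.
    return {(i, n - j - window) for i, j in hits}
```

Comparing a sequence with itself, `dot_matrix(seq, seq, window)`, marks direct repeats as off-diagonal dots, whereas `inverted_repeats(seq, window)` marks potential hairpin-forming segments. Degenerated palindromes would show up as interrupted diagonals when a shorter window is used.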

D. NUCLEASE S1 MAPS

Nearly two decades ago, the first 5S rRNA was sequenced (Brownlee
et al., 1967). Since that time dozens more sequences have been
determined. Yet, the secondary structure of 5S rRNA remains to be
resolved fully (Noller, 1984). Most attempts to reveal the secondary
structure of 5S rRNAs have relied on base-pairing schema for insight
into "unique" structures (Tinoco et al., 1971; Dams et al., 1982; Dams
et al., 1983; DeWachter et al., 1982; Pieler and Erdmann, 1982). See
Figure 13. Unfortunately, no single "permissible" base-pairing scheme
exists. For example, Trifonov and Bolshoi (1983), with the aid of a
computer to overlay dozens of 5S rRNA sequences, demonstrated that two
relatively unrelated secondary structures, termed "Y-form" and "P-form",
appear equally probable. Computer analysis of the free energy of the
secondary structure of each of these, using Ninio's rules (Ninio, 1979;
MacDonell and Colwell, 1984c), indicates that both represent
thermodynamically stable structures. Several chemical approaches have
been taken to resolve the 5S rRNA secondary structure, including
NMR (Kime and Moore, 1982; 1983b), nuclear Overhauser effect (Kime and
Moore, 1983a), and X-ray scattering (Leontis and Moore, 1984). The
consensus of all is that the A/A', D/D' and E/E' regions (Figure 13)
form a single cylindrical rod (helix), which, although informative, does
not contribute a major insight into the secondary structure problem.

Figure 13. Representation of several 5S rRNA secondary structure
models. The proposed secondary structure models of (a) DeWachter et
al. (1982), (b) Fox and Woese (1975), (c) Trifonov and Bolshoi (1983)
(Y-form model), (d) Trifonov and Bolshoi (1983) (P-form model), and
(e) MacDonell and Colwell (1984d) are presented for comparison.
Although each of these models is supported by experimental evidence,
no single model is adequate to explain the accumulated observations
from secondary structure analyses. It is probable that the 5S rRNA
molecule does not possess a single secondary structure, and that its
role may be modulatory, shifting between 2 (or more) conformations.
Helices are identified as in Hori and Osawa (1979). The 5S rRNA
sequence is that of V. fluvialis (MacDonell and Colwell, 1984e).

It can be speculated that since 5S rRNA is a relatively small
molecule, with at least two thermodynamically stable secondary
structures, it may possess a modulatory function. If this is true, it
would not be expected to possess a unique secondary structure, but to
switch between two (or more) structures. In light of this speculation,
results of the nuclease S1 limited hydrolysis studies (Figure 10) are
encouraging, since they suggest the existence of three types of regions
associated with base-pairing: helical, single-stranded, and
"intermediate". Helix regions (never hydrolyzed by S1) and loop
regions (always hydrolyzed by S1) are general features of several
secondary structure models (DeWachter et al., 1982; Pieler et al., 1982;
1984; Trifonov and Bolshoi, 1983), whereas "intermediate" regions
(occasionally hydrolyzed by S1) are represented as helices in some
models and loops in others, best seen by comparing the "P-form" model
(Trifonov and Bolshoi, 1983) with the "3-helix" model (DeWachter et al.,
1982). Therefore, rather than confirm a particular secondary structure
for the 5S rRNA molecule, it is concluded that the nuclease S1 maps
reported here provide significant indirect evidence for a modulatory
role.
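
The three-way classification of regions described above can be expressed as a simple rule applied to each nucleotide position. The sketch below is illustrative only: the input layout (a count, per position, of the digests in which cleavage was observed) and the function name are assumptions, and the counts in the usage note are invented rather than taken from Figure 10.

```python
def classify_s1_regions(cut_counts, n_digests):
    """Label each position from nuclease S1 partial-digest results.

    cut_counts[i] is the number of digests (out of n_digests) in which
    position i was hydrolyzed. Never cut -> helical; always cut ->
    single-stranded; otherwise -> intermediate.
    """
    labels = []
    for count in cut_counts:
        if not 0 <= count <= n_digests:
            raise ValueError("cut count outside 0..n_digests")
        if count == 0:
            labels.append("helical")
        elif count == n_digests:
            labels.append("single-stranded")
        else:
            labels.append("intermediate")
    return labels
```

For example, `classify_s1_regions([0, 3, 1, 0], 3)` labels the four positions helical, single-stranded, intermediate, and helical. Positions labeled intermediate are the ones that would be drawn as helices in some models and as loops in others.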

E.  A  CONSERVED  READING  FRAME  IN  5S  rRNA 

Shepherd (1981; 1982) showed that a relatively high frequency of
"RNY" codons occurs in reading frames having a direct coding function,
and that these may represent a vestige of an ancestral comma-less code.
Sixteen such codons specify 8 amino acids. Although none of the 3
reading frames of 16S or 23S rRNAs contains other than the statistical
distribution of either "RNY" or "STOP" (UAA, UAG, UGA) codons, one frame
(the third) in 5S rRNAs was found to contain a significantly large
number of "RNY" and virtually no "STOP" codons (Erdmann et al., 1983;
Colin Clarke, personal communication). Since the number of 5S rRNA
sequences determined in this study is approximately equal to the number
of published 5S rRNA sequences from which such findings were compiled,
the sequences reported herein were screened for frequencies of "RNY"s
and "STOP"s (Table 6). The frequencies of "STOP" codons in each of the
3 frames, in particular, are striking. Furthermore, in each of the 5S
rRNAs prepared from species of the genus Vibrio sensu strictu, a frame
with no "STOP" codons occurs, despite the presence of a significant
number of point mutations in the population as a whole. Such a strongly
conserved stop-less reading frame justifies the speculation that frame
#3 of 5S rRNA may retain, or participate in, a direct coding function.
The hypothetical 39 amino acid polypeptide coded for by the 5S rRNA from
V. fluvialis, for example, would be:


5'-Pro-Gly-Asp-His-Ser-Val-Leu-Asp-Pro-Pro-Asp-Ser-Ile-Pro-Asn-Ser-Glu-Val-Lys
3'-Ala-Gln-Arg-His-Glu-Val-Arg-Val-His-Pro-Phe-Gly-Val-Ser-Gly-Asp-Val-Ser-Asn-Arg

Because of the overall conservation of the nucleotide base sequence of
5S rRNAs, only a small number of amino acid residues are susceptible to
effects associated with point mutations. Taking this information into
account, a consensus polypeptide sequence for the Vibrionaceae can be
written, as follows:

5'-Pro-Gly-Asp-His-Ser-X-Y-Asp-Pro-Pro-Asp-Ser-(Ile/Met)-Pro-Asn-Ser-Glu-Val-Lys-Ar
3'-Ala-Gln-Arg-His-Glu-Val-Arg-Val-His-Pro-Phe-Gly-Val-Ser-Gly-Asp-(Val/Ala)-Ser-(Asn/Ile
where X may be Val, Asp, Cys, or Ile, and Y may be Leu, Val, or Phe.


Whether or not this polypeptide species (or an analog) occurs in
the bacterial cell remains to be determined. If it is extant, its
function(s) will have to be identified, which would no doubt contribute
to an even greater understanding of the evolution of the prokaryotes.
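
The screening of reading frames for "RNY" and "STOP" codons described in this section can be sketched as below. This is a minimal illustration rather than the Appendix A program; the function name and the returned dictionary layout are assumptions. R denotes a purine (A or G), N any nucleotide, and Y a pyrimidine (C or U).

```python
PURINES = "AG"
PYRIMIDINES = "CU"
STOP_CODONS = {"UAA", "UAG", "UGA"}


def scan_frames(seq):
    """Count RNY and STOP codons in each of the three reading frames.

    Returns a list of three dicts, one per frame (numbered 1 to 3),
    with the number of complete codons, RNY codons, and STOP codons.
    """
    seq = seq.upper().replace("T", "U")
    results = []
    for frame in range(3):
        # Complete triplets only; a trailing partial codon is ignored.
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        rny = sum(1 for c in codons
                  if c[0] in PURINES and c[2] in PYRIMIDINES)
        stops = sum(1 for c in codons if c in STOP_CODONS)
        results.append({"frame": frame + 1, "codons": len(codons),
                        "rny": rny, "stop": stops})
    return results
```

For a 120-nucleotide 5S rRNA, the third frame contains 39 complete codons, which agrees with the length of the hypothetical polypeptide given above. A frame whose `stop` count is zero across all strains is the kind of stop-less frame discussed in the text.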
APPENDIX A.


The  following  are  original  BASIC  language  computer  programs 
employed  in  the  management  and  analysis  of  nucleic  acid  data  generated 
in  the  course  of  this  study. 

1. program to generate dot matrices from RNA sequences

2. program to search reading frames for RNY and STOP codons

3. program for pairwise comparisons of nucleic acid sequences

4. program to assign K(nuc) values to pairs of RNA sequences

5. program to determine free energy using Ninio's rules

6. program for data management; sequence and file handling
127  IFA$="UAAA"THEN16

128  IFA**"OAAG*THEI416 
12?  IFA4*"AUAA’THEN16 

130  IFA4**AU6A'THEH16 

131  IFA4*'AUA6*Tl€Wfc 

132  IFAf»*CGGUT’T}£N17 

133  IFA*»*6CU6T*THEI*1B 

134  IFA*»*U6UA’THEN1B 

135  1FA*»*U6C6*7>®D 
137  IFA4*”TNEN150 
140  CLS:60T025 

150  NMsBf*’’ 

151  CIS 

J55  PRINT*  = 

160  PRINT*  ADJUST IB4T  FOR  HAIRPINS  Hairpin  t’N 
16S  PRINT* 

170  PRINTV999  «  ESCAPE)  A6»*A:PRINT:IHPUT*NUHB£R  OF  BASES’ ;H 

171  I FH*  99?  THEM  50 

175  INPt/T’NUHBER  OF  ’U’  RESIDU£S’;UH 

180  G0SUB200 

190  MM1:60T0151 

200  IFUH*0THEN290 

210  IFUf  1THEH360 

220  IFH*3TH£NA*A+6.4 

230  IFH»4THENA*A*4.3 

240  1FH-5THENA-A+3.7 

250  IFM6THENA-A+3.9 

260  IFH»7THENA»A+4.3 

270  IFH>7THENA*A+(4.7M (H-8)  10.2) ) :RETURN 
2B0  RETURN 

290  IFH»3THEHA»A+8.4 
300  IFW*4THENA»A*5.9 
310  IFH*5THENA*A*4.9 
320  IFH*6THENA*A*4.7 
330  IF>WTHENA«A*4.7 




SB  JFA**'C6A6’T>£N6 
39  IFfl$»*ftOCC*THEN6 
tO  IFM>'AUG6’TXEM6 

61  IFAf»‘6CUC*THEN7 

62  IFAfKXU’TlCH? 

63  1FA4**1)AAC*TH£N7 

64  IFA^’CKirTHDtf 
63  JFA»»HACA*THEN7 

66  1FA4»*C6UC*T»CN7 

67  !FA4«*AUAC*T1CN7 
66  IFA4**AUCA*THEN7 

69  IFA4**6CUU"TH0(8 

70  IFA6**C66U,THEN8 

71  IFA6**AU6U*THEN8 

72  JFA4*’U66C*THEN8 

73  IFAM’CttJU’THENfl 

74  ifa^'amui'toem 

73  IFA4»*C6C6*THEN9 

76  IFA4»*6C6C*THEN9 

77  IFA4**C6U6*THEN10 
7B  IFAfAUUe’TWtmO 

79  lFA«*’6injA*MN10 

80  IFA4**6U6C*THEH10 

81  IFA$**AU06T*TMEJ410 

82  IFA4**6CGUT,'nO10 

83  IFA^’CRjeriHEHlO 

84  IFA6**UAUA’T)£N11 
83  IFA4»*UAAUT1€M1 

86  IFA4**AUUA*T)£)U1 

87  1FA4**AUAU*THEN11 

88  IFA4*’U6U6,'nCN12 

89  !FA4*,6UGU*THEN12 

90  IFAA**tJACl>*T>0412 
41  lFA»«*UAUC*TfCH12 
92  IFA$*‘AUUC’THEN12 

43  IFA$**A0CU"THEN12 

44  1FA4*,BU6UT*TH£N12 

45  ]FA**"U6UGT“THEM12 

46  IFA^’IrtTWHOtf 

47  IFAt-'BineU’DO^ 

48  IFA»**6UU6*TH£N13 

49  IFA4**UAUU*T)1EN13 

100  IFAI-’AUWTHENi: 

101  IFA^’WUCT’MMS 

102  1FA***U66U*THEJU4 

103  IFA4**U6CU*THEX14 

104  1FA$«*U66A,THEN14 
10!  1FA4**U6A6*THEN14 

106  IFAI**U6AC*T)€N14 

107  1FA4*’U6CC*THEA14 

108  IFA$«*U666,THEN14 
10*  IFAt»*U6UlPT)£Nl4 




0  REM DETERMINATION OF FREE ENERGY OF SECONDARY STRUCTURE BY NINIO'S RULES

1  ClSiA»O:60TO23 

2  A*A- 1.2: 60T023 

3  A*A-3.2:60T023 

4  A*A-.2:6DT025 
3  A*A-.8s 60T023 

6  A»A*.2:60T025 

7  A»A*.B:60T025 

8  A*A-.5:60T025 
?  A*A-3.B;60T023 

10  A=A-1:60T025 

11  A*A-1.7:60T023 

12  A*A+1:60T025 

13  AeAO: 60T025 

14  A=*A*1.3:G0T023 

15  A*A-3.3:60T023 

16  A»A-1.2:60T023 

17  A*  A-.  3:607025 

18  A*A-2:60T025 

21  PRINT*  yxiiiiantatTniiri:i;ri;irnt»3»g  * 

25  CIS: PRINT*  COMPUTATION  Of  A&  BY  NINIO’S  RULES* 

26  PRINT*  >»»«««>rnc=ii:ii^=r=;;;=;=rr=:i;:1 

27  PRINT*  Enter  Non-Canonical  Bases  as  2d  Pair* 

29  PRINT* - * 

30  AI«**:PRINT*  A6**A:PRIN*:PRINT8200,**i:INPUTA$ 

31  IFA*»*C6U6*THEN2 

32  IFAI»*6C6U*TJ£N2 

33  IFA»**C66C*TT02 

34  IFA4=*6UC6*T>02 

35  IFA»**"SAU*TT03 

36  IFA$=*6CAU*THEN3 

37  1FA»>*UAC6*TT£N3 
3B  1FA*=*AUC6*THEN3 
Vf  IFA*=*C6UA*THEM3 

40  IFA4**6CUA*TNEN3 

41  IFA*»*UA6C*T}£N3 

42  1FA4**AU6C*THEN3 

43  IFA$e*6CAC*THEN4 

44  1FA4**6CCA*T>04 
43  IFASe*C6CA*THEN4 

46  !FA4**C6AC*THEX4 

47  1FA$«*6CCC*THEN3 

48  IFAI»*6C66*THEN3 
4?  IFA4«*C6CC*THEN3 
50  1FA4**C666*THEN3 

31  IFA4«*6CAA*THEN6 

32  IFA4»*6C6A*THEN4 

33  IFA4»*6CA6*TNEN6 

34  IFAI»*CS6A*T>CN6 

35  1FAI»*UA66*THEN6 

36  lFA4**UACC*TH£Nt 
57  lFAVCSAA'THENb 




10030  IF(W«'V  PSYC'THENA4*PP< 

10031  IFM»*P  FLU0*THENAMFF4 

10032  IF0»«*VP1*THE)W*«HH» 

10033  IFO»»*R  RUBR’THENAf'RM 

10034  IFW-’A  HIDU’THENAO-NDI 
10033  IFD$»*S  LIVI'THENAMl* 

10036  IFW^PROCTHENAWW 

10037  IFDf’T  AOUA*THENA«>M« 

10038  IFWT  THER'THENAO-TT* 

10039  lFM*,,MNPRIHT*STRftIN  HOT  LOCATED  IH  DATABANK  * :  FORI  *  1 TOSOO:  NEXT  1 :  GOTO  1 30 

11000  IFW*'V  PARA,TH£NB4*PA$ 

11001  IFW**V  FLWTHENBI«FL$ 

11002  IFW«*V  CHOL’THENB4*CH1 

11003  !FW«‘V  HARV*T«NM*(tt$ 

11004  IFW*'V  FISC*THE»*«FIO 
11003  IFW**V  HARrTHENBWIAl 

11006  IFRI«'V  WIH*THE»»«VUI 

11007  IFR$»*I(ALVIS,THEHB<*8A4 

11008  IFM*'V  AN6U*THENB4»ANI 

11009  IFM«‘P  PHOS*THEHW«PHI 

11010  IFR4**A  HYDR*THENB*»HYI 

11011  IFM*,RIFTIA’THEHB4»RF4 

11012  IFW«*CALYPT*THENB**CAI 

11013  IFR»*,S0L,THEHB$«S04 

11014  IFW«’P  SHI6‘T)OB**SHI 
11013  IR$**P  MRA,THENB$*RI4 

11016  IFR$«'P  VUL6*T>©IB4*Vli 

11017  IFM«*Y  PEST’THEXB^PEI 

11018  1FM«'A  VIHE’THENB$*V1» 

11019  IFR***P  AEFU’THENBI’AE* 

11020  IFR***E  ADES’THENB**AD$ 

11021  1FW»‘E  AR6S*THENB4*AR$ 

11022  1FR$«*S  IWRC*THENBI=HM 

11023  IFW«*S  TYPH’THEHM'TYI 

11024  1FM**E  C0LI,THEWI*C01 
11023  IFR$«*V  AL6I*THENB4*AL4 

11026  IFR4**V  NATR*THENB4*NN4 

11027  IFM«*V  HInrTHENB1*HH1 

11028  IFPI«*V  CARC’THENB4*CC4 

11029  IFM**V  DIA2*THE)ffl$«DD4 

11030  1FRI»*V  PSYC*THEHB$*PP4 

11031  !FR$»'P  FLU0,THEHB4=FF4 

11032  1FM«,VP1*THEHB»«HH4 

11033  IFRI«*R  RUBR*THENB«»RM 

11034  1FW«‘A  N1DU,THENB4«W)» 

11033  IFW«*S  LIVI*THENB4*LL* 

11036  IFR$»*PRK,THENB**PR4 

11037  IFP1«*T  AQUA’THENB^AA* 

11038  IFR*»,T  THER’THENB4*TT4 

11039  IFR4«**THENPRINT,5TRAIN  NOT  LOCATE  IN  DATABANK* ; FOAI *1 TD300: NEXT  1 : 60T01 31 1 

11040  RETURN 


♦023  SIM/KB 
♦024  SB*  (SA+SI1/2 
♦028  SJ*SAA2 
♦02?  SK-SB-2 
♦030  SL»tSJIP)MSM) 

4031  SIM<SAIP)*<SI»Q))A2 
♦032  SS»(SL-SN)/N 
5145  fHIKT’1/2  K(nuc)  •  *401/2 
3144  PRINT'SEU)  *  *SS 

3147  PRIKT’DUMP  TO  PRINTER?* 

3148  Y4»INK£Y4: IFY1*'*THEN5I4B 
314?  IFY***Y*T*N60SUB6000 
3150  PRINT'COHTINUE?* 

3131  Rt*I8KEY*jIFM*,,THD(3151 

3132  IFF4**Y'THEX5133 
3153  MENU 

5135  CLEAR: 60T06 

4000  LPRI*T*REF  STRAIN  •  *01 

4001  LPRWTEST  STRAIN  *  *R« 

4002  LPRINT’1/2  K(auc)  *  *401/2 

4003  IPRINT’SElk)  *  *SS 

4004  LPfilNT 
4003  RETURN 

10000  A$*,*:BI*":IFQ»**V  PARA*TNENA1=PA» 

10001  IFQ$*’V  FlUVTHENA*=fl1 

10002  1R»**V  CHOL*D©4AO=CH* 

10003  IF0I**V  HARV*THENA1=HA« 

10004  IFQ»**V  FISfTHENA»*FII 

10005  IFW**V  MARI  *T>®4A**HA* 

10006  IF0$*’V  VXN*THEJiA1*VU1 

10007  1FD$**8ALVIS*THENA1=IIA1 

10008  IFO»**V  AH6U*THENA1*AN1 

10009  IFQ$**P  PHOS*THENA4=PH1 

10010  IFOI**A  HYM’THENAIsHYl 

10011  IFM**RIFTIA‘THENA1*RF4 

10012  IFO$*,CALYPT,THDtt»*CA« 

10013  lFW**Stn.*THENA4*SO» 

10014  IFB$**P  SHI6"THENA1sSH1 
10013  IF0$«*P  HIRA*THENA>*M* 

10016  IFQ»«*P  VUL6’THEXA4*V14 

10017  IFB$**Y  P£ST'THENA4*P£4 

10018  IF0$**A  VINE'THENA4*VI4 

10019  IFB*«*P  AERU*THENAI*AE4 

10020  IFD»«*E  ADES*THENA1»AD* 

10021  IF0I*‘E  AR6S'TNENAI«AR1 

10022  1F01*'S  NARC’THENAI'HRI 

10022  IFQ$**S  TYPH*THENA1=TY1 
10024  IF0»**E  C0LI*THENA1*Ct)4 

10023  IF04«*V  AL6I‘THENA»*AL» 

10026  !F04«'V  NATR,THENA$*NN4 

10027  IF04**V  NIHI*THENA1»ffl* 

10028  1FQI»*V  CARC"THENA4*CC$ 

10029  lFQi**V  D!AI*THENA4»DD4 


131  INPUT"Test Taxon";R$

132  PRJNT:PRlNT,*orking...,:60SUB10000 

133  60T03000 

200  IF  HIOMBl,  I,l)**A,THENH*fHl 

201  IF  I,1),’6*T>€IIS*S+1:R*M*1 

202  IF  NIBtdt,  l,  i)«*U*THEllV«V+lsll»NM 

203  IF  HIBHBf,I,l)*'C*T)CNV*V»l:N*N*l 

204  IF  HIM1M, I,  l)»'-*T)CNV«V+l  «N«N+1 
203  60T04008 

210  IF  HIP»(M,I,l)«'A,TJfHS»S+l:N«N*l 

211  IF  HIP»(BB,I, l)»,6,'nCNN*H»l 

212  IF  HIM  (If,  1, 1 )  ••U*TJ£NV»V*1 :  N*N*1 

213  IF  HIP4(M,I,1)*,C*T)CNV»VM:N*N*1 

214  IF  HI0»<M,I,I)**-,T«NV»V*1:N*N*1 

215  6UT04008 

220  IF  HID*(M, I, l)«*A*T!CNV-V+l:N=N+l 

221  IF  HID4IBI,  I,1)**6*T1£NV»VM:N*N+1 

222  IF  HIM(BB,I,1)«'U*T}CNN*N*1 

223  IF  HIPI(M,I,1)**C*THENS*S*1:N=N+1 
24  IF  HIWlBt,I,l)’*-*THENV*V«T:N*N+l 
23  60T04008 

20  IF  H1P4IB4,1,1)**A*THENV=V+1:N=N-M 

21  IF  HIDHB4,I,1)«,6*THENV=V+I:N=N+1 

22  IF  HID4(M,I,l)**U*T}OS*S+l:N=H+l 

23  IF  HIBIlM.I.Ds'C'TlfNfhN+l 

24  IF  HIM  (Bl,  1, 1 ) s*-*THENV=V+l:  NrH+l 
22  60TD4008 

240  IF  HID4(B4,I,1)»*A,T)£NV*V*1:N=H+1 

241  IF  HIM  (B$,  1, 1 )  **G'T}CHV*V*1 : N=N+1 

242  IF  HID* (Bl,  1, 1 ) »*U*THEHV*V»1 : N=N*1 

243  IF  HI04(M,I,l)**C*T)€MV=V+l:NiN4t 

244  IF  HID! (B*f I, 1 )»*-*THEHN*H 
24  3  60104008 

273  LPRIHT* - 

26  UHIKI*  DETERHIHATIOH  OF  EVOLUTIONARY  DISTANCE  BY  KIHURA’S  ALGORITHM' 

27  LPRINT* - 

28  LPRINT:LPRINT:LPP.INT:LPRIHT:LPRINT 
2?  60T06 

3000  F0RI*BT0E 

3001  IF  HIMIA$,I,1)»*A*THEN  200 

3002  IF  HIB4(A4,l,l)=*6'THEH  210 

3003  IF  HI04(A»,I,1)*'U*TH£N  20 

3004  IF  H1MIA4, I,  U*'C'THEN  230 

3005  IF  HIMIAl.I.n-'-'THEN  240 
4008  NEXT1 

4017  P*S/H:D*V/N 

4018  KA*(l-(2tP)-0) 

401?  KB»(1-(2I0)) 

4020  KCHCAIKB 

4021  KD*S8R(KC) 

402  KE«LD6(KD) 

402  KN*-«E/2 
4024  SAM/kA 




1  REM COMPUTATION OF K(nuc) BY THE ALGORITHM OF KIMURA (JME 16:111 1980)

2  REM PROGRAM WRITTEN FOR TRS 80 MODEL 100: M.T. MACDONELL. DEPT MICROBIOLOGY, UNIV. MARYLAND

3  CIS: PRINT' DUMP  HEADING  TO  PRINTER’ 

4  YI«INKEY4:JFYD«*,THEN4 

5  IFYI«'Y*THEN273 

6  DATA" 

10  I  AT  A" 

13  DATA" 

20  DATA" 

25  DATA" 

30  DATA" 

33  DATA" 

40  DATA" 

45  DATA" 

30  DATA" 

31  DATA" 

32  DATA" 

33  DATA" 

54  DATA" 

35  DATA" 

56  DATA" 

57  SATA" 

38  DATA" 

5®  DATA" 

60  DATA" 

61  DATA" 

62  DATA" 

63  DATA" 

64  DATA" 

63  DATA" 

66  DATA" 

67  DATA" 

68  SATA” 

6?  DATA" 

70  DATA" 

71  DATA" 

72  DATA" 

73  BATA-  - 

74  SATA" 

73  DATA" 

76  DATA” 

77  DATA" 

78  DATA" 

79  DATA" 

BO  REA0ALI,PAt,FLt,CH$,HAD,Fl$,HA4,VU1,NAD,AN1,PH4,HY$,Rf4,CAD,S04,SHI,Hll,Vl.D1PE*,VlD1AEI,ADD,AM,NR- 

*.  TYD.COD,  NND  CCD,  DM,  PPD,  FF4,  KH«,  Rfil,  NOD,  LLD,  PAD,  AM,  TTD 

103  CLS:PRlKTiPRINTiPRINT:PRIHT:PAlNr  I  CAPS  LOCK  ON  *•  :F0R3=*1T070: NEXT 

110  CLS»LINEI3,3)-(227, 18),  1,B 

111  T«0:N«0 

120  PRINT: PRINT*  ««  DETERMINATION  OF  Mnuc):  K1MURA  •»* 

123  PRINT 
128  B«1:E«14? 

130  INPUT"Reference Taxon";Q$


3  CIS: LINE (60, HI- (196,24) ,  I,B:PRINT9?2, 'SEQUENCE  COMPAfilSON* 

10  60SUB1000 

15  INPUT*Te*t  strain  tixon'jTI 

16  PRINT’Input  •Tl*  sequence"; :INPUTB* 

20  IKEN(M) 

25  IKEN(II) 

29  C*0 

JO  IFf1>9(THEMH*t1 
35  F0RJ«1TW 

40  IFNIM(AM,l)»«ID*(WfI,l)T>CN45 

41  OOl 
45  NEITI 

50  CLS:PRI NT:PRINT'Di  Her  ence  Mtrji  eleeent*  ":LlNEU6l,6)-UB6,16),I,B:PR!NT«67,C 
55  I*  INT  (1 0001 HN-C) /Nil 

60  PRINTsPRINT"!  HoMlogy«":LINE(71(22)-(102,J2M,!:PRINTtl32,]t/10 
63  PRINT 

65  PRINT’Continue?" 

66  R»*IHKEYt 

67  IFRI*”THEN66 

70  IFRt*YTHENCLS:60T015 

71  IFW**Y’THE)CIS:60T015 
75  REHJ 

1000  PRINT: INPUT’Reference  striin  taxon’jR* 

1001  PRINT’Input  "RV  sequence";:INPUTAI 

1002  RETURN 




1  REM PROGRAM WRITTEN BY M.T. MACDONELL, DEPT. OF MICROBIOLOGY UNIV OF MARYLAND: SEARCHES FOR 'RNY' AND 'STOP' CODONS IN EACH OF THREE READING FRAMES OF NUCLEIC ACID SEQUENCES

2  REM WRITTEN IN MICROSOFT BASIC FOR TRS 80 MODEL 100

3  60SUB203 

10  PRINT  .’PRINT:  PRINT sPRINT:  PRINT:  INPUT* SEQUENCE *;A» 

11  M«LEFT»(A*,3) 

12  C«0 

13  A*0 
20  I«1 

25  MINIM) 

30  IFMIW(AI,X,1)»*A*THEN100 

31  IF  HIM(A«,Ifl)**6*THEN100 
40  I*X*3:IFX)ZTHEN120 

45  60T030 

100  IFNID»(A«,X+2,1)**C*THENA*A*1 

101  IF  MlD4(A*,X+2, l)r’U*THENA=A*l 

105  X*It3 

106  IFDZTHEN120 
110  60T030 

120  PP.INTA*’RKT’  sequences  in  True  *CM 
125  A»*HJDI(A*,2,Z-1) 

127  C=CM 
12B  IFC«3Tt€N130 
130  60T015 
150  60T01000 
205  SOUNDON 
20?  CIS 

210  L INE (20, 22) ~ (210,32), lfB 

220  PRINT  8  125,*  ’RNY’  It  ’STOP’  CODON  SEARCH* 

224  S0UW1500,  USOUND2510,  l JSOUWH500, 1:S011H02510, 1  ’.SCUK01500, 1  :S0UW)2510, 1 

225  FORM  T0900:  NEXT:  as 
230  RETURN 

1000  A*OjX*l:C*0:AI*B8+A$ 

1001  ZrL£N(AI) 

1005  IFMIDI(A0,I,3)S*UAA*THEN1100 

1031  IF  MlD*(A$,X,3)**UA6*THEN1100 

1035  IF  HmiA»,X,3)**U6A*THEN1100 

1040  IFDZTHEN1120 

1045  SOTD1003 

1100  A=A+1 

1103  X*I+3 

1106  IFX>!THEN1120 

1110  60TOI003 

1120  PRINTA*’STOP’  codons  in  True  *C*1 
1123  A*N!IDI(A<,2,Z-1) 

1127  C*C+I 

1128  1FO5THEN1150 
1130  A*0:I*I:GOT01003 

1150  PRINT’CtJNTINUE?* 

1151  KMNKEY8 

1152  IFKI***THEN113I 

1153  I FKi*’Y  * THENCLS : 60T01 0 

1154  ffNU 




4018  2JKHR4 (2031:60104020 

4019  ZJ$*CHM(239):60T04020 

4020  LPRINTZJfjJ 

4054  NEXT) 

4055  LPR1NT 

4056  NEXT1 

4057  IIMT 'CONTINUE*  jZZ* 

4056  IFZZ*«'N'T)£NEND 
4059  CLEARiGOTOlO 

5149  PRINT 

5150  PRINT’CONTINUE?' 

5151  RWNKEYt:  IFR$*"THEN5151 

5152  IFW«'N'THEN4000 

5153  G0T0190 

10000  A^jW*” 

10001  IFW*''THEHA*M 

10039  1FA6*''THENPRINT'STRAIN  NOT  LOCATED  IN  DATABANK*:FORI«1T0500:NEXT!:60T0130 
11000  IFW**'THENB4*4 

11039  1FW«''THENPRINT'STRAIN  NOT  LOCATED  IN  DATABANK':F0f!I«lTO5OO:NEXTI:6OTOl311 

11040  RETURN 

20000  FORIMTOO 

20001  FORJ-1TOO 

20003  LPRINTGI(J,I);: 

20004  NEXT) 

20005  NEITI 

20006  END 


1  REM BASIC PROGRAM FOR DOT MATRIX ANALYSIS OF PALINDROMIC AND REPEATED SEQUENCES
3  CLEAR: DIMS! (100, 100) :IPRINTCHR1(27)CHR4(77)CHR4(I4)  ilPRINTCHRI  (15)  :LPRINTCHRI(27)CHR4  (65ICHRK4) 
3  REM M.T. MACDONELL & R.R. COLWELL, DEPT OF MICROBIOLOGY, UNIV OF MARYLAND

6  DATA* 

180  READ 

181  CLStPRlNT'DTMX  DENSITY  MAP  PROGRAM-tPRINTiPRIMT* <1)  REPEATED  SE0UENCES*:PRINT*(2>  PALINDROMIC 
SEQUENCES* i  PR  I NT* (3)  COMPARE  TNO  SEOOEHCES** I  W*UT*CMOICE*jr 

190  I Ht/T* Identify  Reference  Sequence*}!!* 

191  1FK«1THENM*«*REPEATED  SEQUENCE  SEARCH*: R*«Q*:60T0193 

192  IFK»ZTHENM*«*PALINDROME  SEARCH*:R*»0*: 60T0196 

193  1FK«3THENW«** 

194  INPUTMdentify  Test  Sequence’jR* 

195  IHWOLIGOMER  DEFAULT  LENGTH* ;L 

196  PRINT: PRINT*Norking...*:60SUB10000 

197  IFWTHEN60T03000 

198  PRINT* DOT  MATRIX:  *Q*’  x  *RI* 

199  PRINT 

200  PRINT 

201  PRINT 

203  IFLENIAS)  M_EN( B$) THENO=LEN (Af ) : 90T0210 
206  (XEH(W) 

210  F0RI*1T00:F0R3*1T00 

220  IFMIWIAI, J,L)=MIDF(B*, I,L)TNEN6I(J,I)*1*52(J, I) 

230  NEITJ 

240  PRINTCHMI 13) ;:PRINT(ItlOO) "ARRAY  ELEMENTS  COMPLETED*} jNEXTI 
250  60T05149 

3000  0*L£N(A$):INPUT*0L ISOMER  DEFAULT  LENGTH*; L 

3001  M-*  * :  FORI=OTO  1  STEP- 1 

3002  Y*=HIDt(A*,  1,1) 

3003  B*«B*m 

3004  MEXT1 

3005  G0T0198 

4000  WUT’DUNP  TO  PRINTER* ;ZZ*:IFZZ*S*N*THEHEND 

4001  IFK*lTHENLPfiI NT ’REPEATED  SEQUENCE  DENSITY  HAP*:LPRINT:LPR1NT:LPRINT 

4002  IFK»2T*NLPRINT*PALINDR0MIC  SEQUENCE  DENSITY  HAP*:LPRINT:LPRlNT:LPPJNTi 

4003  LPRINT’REPRESENTATIVE  SEQUENCE  ■QttLPRINT:LPRINT:LPRINT:LPfiINT*  20  30  40 

50  60  70  80  90  100* 

4004  LPRINT*  ........ 

.  *  jLPRINTj  LPHIHT 

4005  LPRINT A*:LPR1NT:LPRINT:LPR1HT 

4006  FOR 1*1 TOO : FOR J  * 1 TOO 

4007 IFG%(J,I)>5THENG%(J,I)=5
4008 IFG%(J,I)=0THEN4014
4009 IFG%(J,I)=1THEN4015
4010 IFG%(J,I)=2THEN4016
4011 IFG%(J,I)=3THEN4017
4012 IFG%(J,I)=4THEN4018
4013 IFG%(J,I)=5THEN4019
4014 ZJ$=CHR$(124):GOTO4020
4015 ZJ$=CHR$(43):GOTO4020
4016 ZJ$=CHR$(173):GOTO4020
4017 ZJ$=CHR$(174):GOTO4020
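
The matching step of the dot matrix listing above (lines 210 to 240 and 3000 to 3005) can be illustrated with a minimal Python sketch. It is an illustration only and not the program used in this study. The function names `dot_matrix`, `repeated_map`, and `palindrome_map` are hypothetical, and the sketch assumes that only full-length oligomer windows are compared, whereas the BASIC listing also compares the shortened substrings at the 3' end. As in the listing, the palindrome search compares the sequence with its plain reversal, without complementing the bases.

```python
def dot_matrix(a, b, k):
    """Mark grid[j][i] = 1 where the k-mer of a at offset j equals the k-mer of b at offset i."""
    n = max(len(a), len(b))
    grid = [[0] * n for _ in range(n)]
    for i in range(len(b) - k + 1):
        for j in range(len(a) - k + 1):
            if a[j:j + k] == b[i:i + k]:
                grid[j][i] = 1
    return grid


def repeated_map(seq, k):
    """Repeated-sequence search: compare a sequence with itself."""
    return dot_matrix(seq, seq, k)


def palindrome_map(seq, k):
    """Palindrome search: compare a sequence with its plain reversal (no complementing)."""
    return dot_matrix(seq, seq[::-1], k)


if __name__ == "__main__":
    # Print a small map, one row per offset in the reference sequence.
    for row in repeated_map("GAUCGAUC", 3):
        print("".join("+" if cell else "." for cell in row))
```

In a repeated-sequence map the main diagonal is always filled, since every oligomer matches itself; off-diagonal marks are the repeats of interest.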


1 REM Program "DATA MANAGER" by M.T. MacDonell August, 1984. Written for TRS 80 (M100 system
2 MAXFILES=2
5 SOUNDON
9 CLS
10 LINE(20,20)-(210,35),1,B
20 PRINT@133,"DATA MANAGER"
24 SOUND1500,1:SOUND2510,1:SOUND1500,1:SOUND2510,1:SOUND1500,1:SOUND2510,1
25 FORI=1TO900:NEXT:CLS

26 PRINT:LINE(68,6)-(103,17),1,B:INPUT"READ OR WRITE";L$
27 IFL$=""THEN26
28 IFL$="R"THEN2000
29 IFL$<>"W"THEN26
30 CLS:PRINT:GOSUB1000:LINE(124,6)-(139,17),1,B:INPUT"INPUT FILE REQUIRED";A$
35 IFA$<>"Y"THEN50

40 CLS:GOSUB1000:PRINT:LINE(118,6)-(163,17),1,B:INPUT"NAME OF INPUT FILE";FI$
41 OPENFI$FOR INPUT AS 1
42 TT=1
50 CLS:GOSUB1000:PRINT:LINE(124,6)-(169,17),1,B:INPUT"NAME OF OUTPUT FILE";FO$
55 OPENFO$FOR OUTPUT AS 2
60 PRINT:GOSUB1000:LINE(112,22)-(133,33),1,B:INPUT"NUMBER OF ENTRIES";N
65 PRINT:LINE(160,38)-(173,49),1,B:INPUT"IS DATA NUMERIC OR STRING";H$

70 IFTT=1THEN100
71 IFLEFT$(H$,1)=""THEN71
72 IFLEFT$(H$,1)="S"THEN900
73 IFLEFT$(H$,1)<>"N"THEN71
74 FORI=1TON
75 GOSUB1000:INPUT"ITEM,DATA";IT$,DA
80 PRINT#2,IT$","DA
90 NEXT
95 GOSUB1000:INPUT"END";A$
96 IFA$="Y"THENCLOSE#1:CLOSE#2:END
97 GOTO9

100 IFLEFT$(H$,1)=""THEN65
101 IFLEFT$(H$,1)="S"THEN170
102 IFLEFT$(H$,1)<>"N"THEN65
103 FORI=1TON
110 INPUT#1,IT$
115 PRINTIT$
120 GOSUB1000:INPUT"DATA";DA
130 PRINT#2,IT$;DA
140 NEXT
150 GOSUB1000:INPUT"END";A$
155 IFA$="Y"THENCLOSE#1:CLOSE#2:END
160 GOTO9

170 FORI=1TON
175 INPUT#1,IT$,DA
180 PRINTIT$,DA
185 GOSUB1000:INPUT"DATA";DA$
190 PRINT#2,IT$,DA$
195 NEXT
197 GOSUB1000:INPUT"END";A$
198 IFA$="Y"THENCLOSE#1:CLOSE#2:END


900 FORI=1TON
905 GOSUB1000:INPUT"ITEM,DATA";IT$,DA$
910 PRINT#2,IT$","DA$
920 NEXT
950 GOSUB1000:INPUT"END";A$
960 IFA$="Y"THENCLOSE#1:CLOSE#2:END
970 GOTO9

1000 SOUND3000,3

1001  RETURN 

2000 CLS:GOSUB1000:PRINT:LINE(124,6)-(169,17),1,B:INPUT"NAME OF OUTPUT FILE";FO$
2005 OPENFO$FOR INPUT AS 2
2010 FORI=1TO100
2020 INPUT#2,IT$,DA$
2025 PRINTIT$,DA$
2030 INPUT"(Key to continue; E to end)";Z$
2035 IFZ$="E"THENCLOSE#1:CLOSE#2:END
2040 NEXT


APPENDIX  B. 


The following are sequences of 5S rRNAs representative of each
major cluster (figures 5 and 7). The secondary structure model employed is
from MacDonell and Colwell (1984d).


1. Photobacterium leiognathi
2. Vibrio alginolyticus
3. Alteromonas putrefaciens
4. Vibrio marinus (MP-1)
5. Vibrio anguillarum
6. Vibrio psychroerythrus
7. Aeromonas hydrophila
8. Plesiomonas shigelloides




Vibrio marinus (MP-1)

[Figure: secondary structure model of the 5S rRNA]


Vibrio anguillarum

[Figure: secondary structure model of the 5S rRNA]


Vibrio  psychroerythrus 


[Figure: secondary structure model of the 5S rRNA]


Aeromonas hydrophila

[Figure: secondary structure model of the 5S rRNA]




Plesiomonas shigelloides


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